Abstract

In this paper, we survey the rich theory of infinite episturmian words which generalize to
any finite alphabet, in a closely analogous way, the well-known family of Sturmian words on
two letters. After recalling definitions and basic properties, we consider episturmian morphisms
that allow for a deeper study of these words. Some properties of factors are described,
including factor complexity, palindromes, fractional powers, frequencies, and return words.
We also consider lexicographical properties of episturmian words, as well as their connection
to the balance property, and related notions such as finite episturmian words, Arnoux-Rauzy
sequences, and "episkew words" that generalize the skew words of Morse and Hedlund.

1 Introduction

1.1 From Sturmian to episturmian 

Most renowned amongst the branches of combinatorics on words is the theory of infinite binary 
sequences called Sturmian words, which are fascinating in many respects, having been studied from 
combinatorial, algebraic, and geometric points of view. Their beautiful properties are related 
to many fields such as Number Theory, Geometry, Symbolic Dynamical Systems, Theoretical 
Physics, and Theoretical Computer Science (see [3, 83, 96] for recent surveys).

Since the seminal works of Morse and Hedlund [91], Sturmian words have been shown to
admit numerous equivalent definitions and characterizations. For instance, it is well known that 
an infinite word w over {a, b} is Sturmian if and only if w is aperiodic and balanced: for any two 
factors u, v of w of the same length, the number of a's in each of u and v differs by at most 
1. Sturmian words are also characterized by their factor complexity function (which counts the 
number of distinct factors of each length): they have exactly n + 1 distinct factors of length n 
for each n. In this sense, Sturmian words are precisely the aperiodic infinite words of minimal 
factor complexity since, as is well known, an infinite word is ultimately periodic if and only if 
it has fewer than n + 1 factors of length n for some n (see [37]). Many interesting properties of
Sturmian words can be attributed to their low complexity, which induces certain regularities in
such words without, however, making them periodic. Sturmian words can also be geometrically 
realized as cutting sequences by considering the sequence of 'cuts' in an integer grid made by a 
line of irrational slope (see for instance [SHIE]). They also provide a symbolic coding of the orbit 
of a point on a circle with respect to a rotation by an irrational number (see [HUE]). 
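
For illustration, the balance and factor-complexity characterizations recalled above can be checked numerically on a finite prefix of a Sturmian word. The following Python sketch is only an illustration under stated assumptions: it uses the Fibonacci word (the fixed point of the morphism a ↦ ab, b ↦ a, a standard example of a Sturmian word) as the test case, and the function names are hypothetical, not taken from the literature surveyed here.

```python
def fibonacci_prefix(n_iter: int) -> str:
    """Prefix of the Fibonacci word obtained by iterating a -> ab, b -> a."""
    word = "a"
    for _ in range(n_iter):
        word = "".join("ab" if ch == "a" else "a" for ch in word)
    return word


def factors(word: str, n: int) -> set:
    """Set of all factors (contiguous blocks) of length n occurring in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def is_balanced(word: str, n: int, letter: str = "a") -> bool:
    """Check that the number of `letter`s in factors of length n varies by at most 1."""
    counts = {f.count(letter) for f in factors(word, n)}
    return max(counts) - min(counts) <= 1


if __name__ == "__main__":
    w = fibonacci_prefix(15)
    for n in range(1, 11):
        # Assumption: the prefix is long enough to contain every factor of length n.
        print(n, len(factors(w, n)), is_balanced(w, n))
```

Since only a finite prefix is inspected, such a check can refute but never prove the properties; the prefix length is chosen generously relative to the factor lengths tested.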

All of the above characteristic properties of Sturmian words lead to natural generalizations 
on arbitrary finite alphabets. In one direction, the balance property naturally extends to an 
alphabet with more than two letters (e.g., see [68, 110, 115]) as does the following generalized
balance property that also characterizes Sturmian words (see [49, 1]): the difference between the
number of occurrences of a word u in any pair of factors of the same length is at most 1. In another
direction, we could consider relaxing the minimality condition for the factor complexity p(n). For
example, quasi-Sturmian words are infinite words for which there exist two positive integers N
and c such that $n + 1 \le p(n) \le n + c$ for all $n \ge N$. This generalization was introduced in [5] when
studying the transcendence of certain continued fraction expansions. See also [3T1 [36l [66] 1105] for 
similar extensions of Sturmian words with respect to factor complexity. From the geometric point 
of view, cutting sequences naturally generalize to trajectories in the hypercube billiard (e.g., see 
[25]), and codings of rotational orbits carry over to codings of interval exchange transformations
(e.g., see pj). 

Two other very interesting natural generalizations of Sturmian words are Arnoux-Rauzy
sequences [12, 97] and episturmian words [43, 73], which we will now define.

From the factor complexity of Sturmian words, it immediately follows that any Sturmian word 
is over a 2-letter alphabet and has exactly one left special factor of each length. A factor u of a 
finite or infinite word w is said to be left special (resp. right special) in w if there exist at least
two distinct letters a, b such that au and bu (resp. ua, ub) are factors of w. Extending the left
special property of Sturmian words, a recurrent infinite word w over a finite alphabet A is said
to be an Arnoux-Rauzy sequence (or a strict episturmian word) if it has exactly one left special
factor and one right special factor of each length, and for every left (resp. right) special factor u
of w, xu (resp. ux) is a factor of w for all letters $x \in A$. A notable property that is shared by
Sturmian words and Arnoux-Rauzy sequences is their closure under reversal, i.e., if u is a factor
of such a word, then its reversal is also a factor. This nice property inspired Droubay, Justin, and 
Pirillo's generalization of Sturmian words in [33]: an infinite word is episturmian if it is closed 
under reversal and has at most one left special factor of each length. Sturmian, Arnoux-Rauzy, 
and episturmian words all have standard (or characteristic) elements, which are those having all 
of their left special factors as prefixes. Within these families of words, standard words are good 
representatives in the sense that an infinite word belongs to one such family if and only if it has 
the same set of factors as some standard word in that family. 
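
As an illustration of the notions just introduced, the following Python sketch computes the left special factors of a finite prefix and tests closure under reversal. It assumes the Fibonacci word (generated by a ↦ ab, b ↦ a) as a sample standard Sturmian word; the helper names are hypothetical, and a finite prefix can of course only approximate the factor set of the infinite word.

```python
def fibonacci_prefix(n_iter: int) -> str:
    """Prefix of the Fibonacci word obtained by iterating a -> ab, b -> a."""
    word = "a"
    for _ in range(n_iter):
        word = "".join("ab" if ch == "a" else "a" for ch in word)
    return word


def factors(word: str, n: int) -> set:
    """Set of factors of length n occurring in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def left_special_factors(word: str, n: int, alphabet: str) -> set:
    """Factors u of length n such that xu is a factor for at least two letters x."""
    longer = factors(word, n + 1)
    return {
        u for u in factors(word, n)
        if sum((x + u) in longer for x in alphabet) >= 2
    }


def closed_under_reversal(word: str, n: int) -> bool:
    """Check that the reversal of every factor of length n is again a factor."""
    facs = factors(word, n)
    return all(u[::-1] in facs for u in facs)


if __name__ == "__main__":
    w = fibonacci_prefix(15)
    for n in range(1, 9):
        # For a standard word, the left special factor of length n is the prefix w[:n].
        print(n, left_special_factors(w, n, "ab"), w[:n], closed_under_reversal(w, n))
```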

From the definitions, it is clear that the family of Arnoux-Rauzy sequences is a particular 
subclass of the family of episturmian words. More precisely, episturmian words are composed of 
the Arnoux-Rauzy sequences, images of the Arnoux-Rauzy sequences by episturmian morphisms, 
and certain periodic infinite words (see Section [5]). In the 2-letter case, Arnoux-Rauzy sequences 
are exactly the Sturmian words whereas episturmian words include all recurrent balanced words, 
i.e., periodic balanced words and Sturmian words. 

The study of episturmian words and Arnoux-Rauzy sequences has enjoyed a great deal of 
popularity in recent times, owing mostly to the many properties that they share with Sturmian 
words. In this paper we survey the purely combinatorial work on episturmian words, beginning 
with their definition and basic properties in Section 2. Then, in Section 3, we recall episturmian
morphisms which allow for a deeper study of episturmian words. In particular, any episturmian
word is the image of another episturmian word by some so-called pure episturmian morphism.
Even more, any episturmian word can be infinitely decomposed over the set of pure episturmian 
morphisms. This last property allows an episturmian word to be defined by one of its morphic 
decompositions or, equivalently, by a certain directive word, which is an infinite sequence of rules 
for decomposing the given episturmian word by morphisms. In Section 4, we consider notions
such as shifts, spins, and block-equivalence in connection with directive words, which allow us
to study when two different spinned infinite words direct the same episturmian word. We also
consider periodic and purely morphic episturmian words. In Section 5, our discussion briefly turns
to Arnoux-Rauzy sequences and finite episturmian words. Following this, we study in Section 6
some properties of factors of episturmian words (and Arnoux-Rauzy sequences), including factor 
complexity, palindromes, fractional powers, frequencies, and return words. Lastly, we consider 
more recent work involving lexicographic order and the balance property (including Fraenkel's 
conjecture).

1.2 Notation & terminology 

We assume the reader is familiar with combinatorics on words and morphisms (e.g., see [82, 83]).
In this section, we recall some basic definitions and properties relating to episturmian words which 
are needed throughout the paper. For the most part, we follow the notation and terminology of 
031 EH [751 [62]. 

Let A denote a finite alphabet, i.e., a non-empty finite set of symbols called letters. A finite 
word over A is a finite sequence of letters from A. The empty word $\varepsilon$ is the empty sequence. Under
the operation of concatenation, the set $A^*$ of all finite words over A is a free monoid with identity
element $\varepsilon$ and set of generators A. The set of non-empty words over A is the free semigroup
$A^+ := A^* \setminus \{\varepsilon\}$.

A right-infinite (resp. left-infinite, bi-infinite) word over A is a sequence indexed by $\mathbb{N}^+$
(resp. $\mathbb{Z} \setminus \mathbb{N}^+$, $\mathbb{Z}$) with values in A. For instance, a left-infinite word is represented by
$\mathbf{u} = \cdots b_{-2} b_{-1} b_0$ and a right-infinite word by $\mathbf{v} = b_1 b_2 b_3 \cdots$ where $b_i \in A$. The concatenation of $\mathbf{u}$
and $\mathbf{v}$ gives the bi-infinite word $\mathbf{u}.\mathbf{v} = \cdots b_{-2} b_{-1} b_0 . b_1 b_2 b_3 \cdots$ with a dot written between $b_0$ and
$b_1$ to avoid ambiguity. For easier reading, infinite words are hereafter typically typed in boldface
to distinguish them from finite words. 

The shift map T is defined for bi-infinite words $\mathbf{b} = (b_i)_{i \in \mathbb{Z}}$ by $T(\mathbf{b}) = (b_{i+1})_{i \in \mathbb{Z}}$ and its $k$-th
iteration is denoted by $T^k$. This extends to right-infinite words for $k \ge 0$ and left-infinite words
for $k \le 0$. For finite words $w \in A^*$, the shift map T acts circularly, i.e., if w = xv where $x \in A$,
then T(w) = vx.

The set of all right-infinite words over A is denoted by $A^\omega$, and we define $A^\infty := A^* \cup A^\omega$.
An ultimately periodic right-infinite word can be written as $uv^\omega = uvvv\cdots$, for some $u, v \in A^*$,
$v \ne \varepsilon$. If $u = \varepsilon$, then such a word is periodic. A right-infinite word that is not ultimately periodic
is said to be aperiodic. 

Given a finite word $w = x_1 x_2 \cdots x_m \in A^*$ with each $x_i \in A$, the length of w, denoted by $|w|$,
is equal to m. By convention, the empty word $\varepsilon$ is the unique word of length 0. The number
of occurrences of a letter a in w is denoted by $|w|_a$. If $|w|_a = 0$, then w is said to be a-free.
The reversal $\widetilde{w}$ of w is its mirror image: $\widetilde{w} = x_m x_{m-1} \cdots x_1$, and if $w = \widetilde{w}$, then w is called a
palindrome. The reversal operator naturally extends to bi-infinite words; that is, the reversal of
the bi-infinite word $\mathbf{b} = \mathbf{l}.\mathbf{r}$, with $\mathbf{l}$ left-infinite and $\mathbf{r}$ right-infinite, is given by $\widetilde{\mathbf{b}} = \widetilde{\mathbf{r}}.\widetilde{\mathbf{l}}$.

A finite word w is a factor of a finite or infinite word z if z = uwv for some words u, v (which 
are finite or infinite depending on z). In the special case $u = \varepsilon$ (resp. $v = \varepsilon$), we call w a prefix
(resp. suffix) of z. We use the notation $p^{-1}w$ (resp. $ws^{-1}$) to indicate the removal of a prefix p
(resp. suffix s) of a finite word w. Note that a prefix or suffix u of a finite word w is said to be
proper if $u \ne w$. A factor u of a finite or infinite word w is right (resp. left) special if ua, ub
(resp. au, bu) are factors of w for some letters $a, b \in A$, $a \ne b$.

For any finite or infinite word w, $F(w)$ denotes the set of all its factors. Moreover, the alphabet
of w is $\mathrm{Alph}(w) := F(w) \cap A$ and, if w is infinite, we denote by $\mathrm{Ult}(w)$ the set of all letters occurring
infinitely often in w. Any two infinite words x, y are said to be factor-equivalent if F(x) = F(y),
i.e., if x and y have the same set of factors. 

A factor of an infinite word x is recurrent in x if it occurs infinitely often in x, and x itself is 
said to be recurrent if all of its factors are recurrent in it. For a bi-infinite word to be recurrent, 
any factor must occur infinitely often to the left and to the right. An infinite word is said to be 
uniformly recurrent if any factor occurs infinitely many times in it with bounded gaps [37].

A morphism $\varphi$ on A is a map from $A^*$ to $A^*$ such that $\varphi(uv) = \varphi(u)\varphi(v)$ for any words u,
v over A. A morphism on A is entirely defined by the images of letters in A. All morphisms
considered in this paper will be non-erasing: the image of any non-empty word is never empty.
Hence the action of a morphism $\varphi$ on $A^*$ can be naturally extended to infinite words; that is, if
$\mathbf{x} = x_1 x_2 x_3 \cdots \in A^\omega$, then $\varphi(\mathbf{x}) = \varphi(x_1)\varphi(x_2)\varphi(x_3)\cdots$. An infinite word $\mathbf{x}$ can therefore be a
fixed point of a morphism $\varphi$, i.e., $\varphi(\mathbf{x}) = \mathbf{x}$. If $\varphi$ is a (non-erasing) morphism such that $\varphi(a) = aw$
for some letter $a \in A$ and $w \in A^+$, then $\varphi^n(a)$ is a proper prefix of the word $\varphi^{n+1}(a)$ for each
$n \in \mathbb{N}$, and the limit of the sequence $(\varphi^n(a))_{n \ge 0}$ is the unique infinite word:

$$\mathbf{w} = \lim_{n \to \infty} \varphi^n(a) = \varphi^\omega(a) \quad (= a w \varphi(w) \varphi^2(w) \varphi^3(w) \cdots).$$

Clearly, $\mathbf{w}$ is a fixed point of $\varphi$ and we say that $\mathbf{w}$ is generated by $\varphi$. Furthermore, an infinite
word generated by a morphism is said to be purely morphic. 
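
The construction of a purely morphic word as a limit of iterates can be sketched in a few lines of Python. The example below is a minimal sketch, assuming the morphism a ↦ ab, b ↦ a (so that $\varphi(a) = aw$ with $w = b$); the dictionary encoding of morphisms and the function names are illustrative choices only.

```python
def apply(morphism: dict, word: str) -> str:
    """Image of a finite word under a morphism given by its letter images."""
    return "".join(morphism[ch] for ch in word)


def iterate(morphism: dict, word: str, n: int) -> str:
    """Compute the n-th iterate of the morphism applied to word."""
    for _ in range(n):
        word = apply(morphism, word)
    return word


def prefix_by_formula(morphism: dict, a: str, n: int) -> str:
    """Compute a w phi(w) ... phi^{n-1}(w), where phi(a) = a w."""
    image = morphism[a]
    assert image[0] == a and len(image) > 1, "phi(a) must be of the form a w with w non-empty"
    w = image[1:]
    pieces = [a]
    for _ in range(n):
        pieces.append(w)
        w = apply(morphism, w)
    return "".join(pieces)


if __name__ == "__main__":
    phi = {"a": "ab", "b": "a"}  # assumed example morphism
    for n in range(6):
        print(n, iterate(phi, "a", n))
```

Both functions produce the same prefix of the generated word, since $\varphi^{n+1}(a) = \varphi^n(a)\varphi^n(w)$.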

In what follows, we will denote the composition of morphisms by juxtaposition as for concate- 
nation of words. 

2 Definitions & basic properties

In the initiating paper [33], episturmian words were defined as an extension of standard epis- 
turmian words, which were first introduced as a generalization of standard (or characteristic) 
Sturmian words using iterated palindromic closure (a construction due to de Luca [41]). Here 
we choose instead to begin with the following definition for deriving the main basic properties of 
episturmian words. 

Definition 2.1. [33] An infinite word $\mathbf{t} \in A^\omega$ is episturmian if $F(\mathbf{t})$ is closed under reversal and
$\mathbf{t}$ has at most one left special factor (or equivalently, right special factor) of each length. Moreover,
an episturmian word is standard if all of its left special factors are prefixes of it. 

Note. We can equivalently consider left or right special factors in the first part of the above 
definition since, by closure under reversal, a factor is left (resp. right) special if and only if its 
reversal is right (resp. left) special. 

Remark 2.2. When $|A| = 2$, Definition 2.1 gives the (aperiodic) Sturmian words, as well as the
periodic balanced infinite words (also known as the periodic Sturmian words). See for instance [62]
or Section 7.1.

The following theorem collects together some useful characteristic properties of standard
episturmian words. Before stating it, let us first recall some definitions.

Given two palindromes p, q, we say that q is a central factor of p if $p = wq\widetilde{w}$ for some $w \in A^*$.
The palindromic right-closure $w^{(+)}$ of a finite word w is the (unique) shortest palindrome having
w as a prefix (see [41]). That is, $w^{(+)} = wv^{-1}\widetilde{w}$ where v is the longest palindromic suffix of w.
For example, $(race)^{(+)} = racecar$. The iterated palindromic closure function [71], denoted by
Pal, is defined recursively as follows. Set $Pal(\varepsilon) = \varepsilon$ and, for any word w and letter x, define
$Pal(wx) = (Pal(w)x)^{(+)}$. For instance, $Pal(abc) = (Pal(ab)c)^{(+)} = (abac)^{(+)} = abacaba$. (See
Sections 4.1 and 6.2.1 for further insight about palindromic closure.)
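
The two definitions above translate directly into code. The following Python sketch is a naive (quadratic-time) illustration of the palindromic right-closure and of the function Pal; the function names `right_closure` and `pal` are hypothetical, and no attempt is made at efficiency.

```python
def is_palindrome(w: str) -> bool:
    """A word is a palindrome if it equals its reversal."""
    return w == w[::-1]


def right_closure(w: str) -> str:
    """Shortest palindrome having w as a prefix.

    Writes w = w1 v with v the longest palindromic suffix of w
    and returns w1 v reversed(w1).
    """
    for i in range(len(w) + 1):
        if is_palindrome(w[i:]):  # first hit is the longest palindromic suffix
            return w + w[:i][::-1]
    return w  # unreachable: the empty suffix is a palindrome


def pal(w: str) -> str:
    """Iterated palindromic closure: Pal(empty) = empty, Pal(wx) = (Pal(w) x)^(+)."""
    result = ""
    for x in w:
        result = right_closure(result + x)
    return result


if __name__ == "__main__":
    print(right_closure("race"))
    print(pal("abc"))
```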

Theorem 2.3. For an infinite word $\mathbf{s} \in A^\omega$, the following properties are equivalent.

i) $\mathbf{s}$ is standard episturmian.

ii) Any first occurrence of a palindrome in $\mathbf{s}$ is a central factor of some palindromic prefix of $\mathbf{s}$
(property Pi).

iii) If w is a prefix of $\mathbf{s}$, then $w^{(+)}$ is also a prefix of $\mathbf{s}$ (property Al).

iv) There exists an infinite word $\Delta = x_1 x_2 \cdots$ ($x_i \in A$), called the directive word of $\mathbf{s}$, such
that $\mathbf{s} = \lim_{n \to \infty} Pal(x_1 \cdots x_n)$.

Remark 2.4. The palindromes $Pal(x_1 \cdots x_n)$ are very often denoted by $u_{n+1}$ in the literature
(and we will sometimes use the latter notation when convenient). By construction, these
palindromes are exactly the palindromic prefixes of $\mathbf{s}$. Moreover, $\mathbf{s}$ is uniquely determined by the
directive word $\Delta$.

Proof of Theorem 2.3. i) ⇒ ii): Let $\mathbf{s} = up\mathbf{t}$, $u \in A^*$, $\mathbf{t} \in A^\omega$, showing the first occurrence of
some palindrome p in $\mathbf{s}$. Suppose p is not the central factor of a palindromic prefix. Then we have
$\mathbf{s} = vxwp\widetilde{w}y\mathbf{t}'$, $x \ne y \in A$. By the reversal property, $ywp\widetilde{w}x \in F(\mathbf{s})$, thus $wp\widetilde{w}$ is left special,
hence is a prefix of $\mathbf{s}$. Thus p has another occurrence strictly on the left of the considered one, a
contradiction.

i) ⇒ iii): If iii) is false, let w = ux, with $u \in A^*$ and $x \in A$, be the shortest prefix of $\mathbf{s}$ such
that $w^{(+)}$ is not a prefix of $\mathbf{s}$. Thus $u^{(+)}$ is a prefix of $\mathbf{s}$. If u were not a palindrome then w would
be a prefix of $u^{(+)}$; whence $w^{(+)} = u^{(+)}$, a contradiction. Thus u is a palindrome. Now let q be the
longest palindromic suffix of w. Then $w^{(+)} = w_1 q \widetilde{w_1} = w\widetilde{w_1}$ where $w = w_1 q$, and $w^{(+)} = w_1 q f y g$
and $w_1 q f z$ is a prefix of $\mathbf{s}$ for some $y \ne z \in A$, $f, g \in A^*$. Hence $y\widetilde{f}q \in F(w) \subset F(\mathbf{s})$ and
$z\widetilde{f}q \in F(\mathbf{s})$. Therefore $\widetilde{f}q$ is a left special prefix of $\mathbf{s}$. As qf is a prefix of $\widetilde{w} = xu$, $x^{-1}qf$ is a
prefix of u, hence $x^{-1}qfa$ is a prefix of u for some letter a. So we have $x^{-1}qfa = \widetilde{f}q$, whence
a = x and $qfx = x\widetilde{f}q$. This word is a palindrome and, as it is a suffix of w, this contradicts the
maximality of $|q|$.

iii) ⇒ iv): Trivial.

At this stage, we have proved that standard episturmian words satisfy ii), iii), iv). The
equivalence of these three properties is proved in [43, Theorem 1]. Finally, if $\mathbf{s}$ satisfies them, then
$F(\mathbf{s})$ is closed under reversal and by [43, Proposition 5] all of its left special factors are prefixes
of it, thus $\mathbf{s}$ is standard episturmian. □

Remark 2.5. Hereafter, we adopt "epistandard" as a shortcut for "standard episturmian", as in
[64, 99, 101]. Also, unless stated otherwise, the notation $\Delta = x_1 x_2 x_3 \cdots$ ($x_i \in A$) will remain for
the directive word of an epistandard word $\mathbf{s}$.

Example 2.6. The epistandard word directed by $\Delta = (abc)^\omega$ is known as the Tribonacci word
(or Rauzy word [97]); it begins in the following way:

$\mathbf{r} = \underline{a}\,\underline{b}\,a\,\underline{c}\,aba\,\underline{a}\,bacaba\,\underline{b}\,acabaabacaba\,\underline{c}\,abaabaca \cdots$,

where each palindromic prefix $Pal(x_1 \cdots x_{n-1})$ is followed by an underlined letter $x_n$. More generally,
for $k \ge 2$, the k-bonacci word is the epistandard word over $\{a_1, \ldots, a_k\}$ directed by $(a_1 a_2 \cdots a_k)^\omega$
(e.g., see [59]).
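
As a sanity check of Example 2.6, the Tribonacci word can be computed in two ways: by iterated palindromic closure of prefixes of $(abc)^\omega$, and by iterating a morphism. The Python sketch below assumes the well-known fact (not stated in the example itself) that the Tribonacci word is the fixed point of a ↦ ab, b ↦ ac, c ↦ a; all function names are hypothetical.

```python
def right_closure(w: str) -> str:
    """Shortest palindrome having w as a prefix (naive search)."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w


def pal(w: str) -> str:
    """Iterated palindromic closure of a finite word."""
    result = ""
    for x in w:
        result = right_closure(result + x)
    return result


def tribonacci_by_morphism(n_iter: int) -> str:
    """Iterate the (assumed) Tribonacci morphism a -> ab, b -> ac, c -> a on the letter a."""
    images = {"a": "ab", "b": "ac", "c": "a"}
    word = "a"
    for _ in range(n_iter):
        word = "".join(images[ch] for ch in word)
    return word


if __name__ == "__main__":
    by_closure = pal("abc" * 3)          # Pal of a prefix of the directive word (abc)^omega
    by_morphism = tribonacci_by_morphism(10)
    n = min(len(by_closure), len(by_morphism))
    print(by_closure[:36])
    print(by_closure[:n] == by_morphism[:n])
```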

Note. For recent studies of the properties of the Tribonacci word, see for instance [57, 107] and the
chapter by Allouche and Berthé in [84].

2.1 Equivalence classes 

In [13], an infinite word $\mathbf{t} \in A^\omega$ was said to be episturmian if $F(\mathbf{t}) = F(\mathbf{s})$ for some epistandard
word $\mathbf{s}$. This definition is equivalent to Definition 2.1 by Theorem 5 in [33]. Moreover, it was
proved in [43] that episturmian words are uniformly recurrent, by showing that this nice property
is implied by iv) of Theorem 2.3. Thus, ultimately periodic episturmian words are (purely)
periodic. The aperiodic episturmian words are exactly those episturmian words with exactly one 
left special factor of each length. 

In each equivalence class of episturmian words (i.e., same set of factors), there is one
epistandard word in the aperiodic case and two in the periodic case, except if this word is $a^\omega$ with a
a letter. For example, $\mathbf{s}_1 = (abac)^\omega$ has directive word $\Delta_1 = abc^\omega$ and $\mathbf{s}_2 = (acab)^\omega$ is directed
by $\Delta_2 = acb^\omega$. Both $\mathbf{s}_1$ and $\mathbf{s}_2$ are standard with the same factors. Theorem 4.8 in Section 4.3
demonstrates why this is true in general (see also Remark 4.10).

2.2 Bi-infinite episturmian words 

Definition 2.1 can be extended to bi-infinite words, in which case we must assume they are recurrent.
(As is well known, recurrence follows automatically from closure under reversal in the case of
right-infinite words; see for instance [29] for a proof of this fact.) Bi-infinite words are sometimes
more natural because in particular they can be shifted in both directions, allowing for simpler
formulations. More specifically, a (right-infinite) episturmian word $\mathbf{t}$ can be prolonged infinitely
to the left with the same set of factors, i.e., remaining in the same equivalence class. There are
several such prolongations or exactly one, according to whether or not $\mathbf{t} = T^i(\mathbf{s})$ with $\mathbf{s}$ epistandard and
$i \ge 0$ (see [73, 75]).

Note. Hereafter, 'infinite word' should be taken to mean a right-infinite word, whereas left-infinite 
and bi-infinite words will be explicitly referred to as such. 

2.3 Strict episturmian words 

An epistandard word $\mathbf{s} \in A^\omega$, or any factor-equivalent (episturmian) word $\mathbf{t}$, is said to be B-strict
(or k-strict if $|B| = k$, or strict if B is understood) if $\mathrm{Alph}(\Delta) = \mathrm{Ult}(\Delta) = B \subseteq A$. That is,
an episturmian word is strict if every letter in its alphabet occurs infinitely often in its directive
word.

The k-strict episturmian words are precisely the episturmian words $\mathbf{t}$ having exactly one left
special factor of each length and for which any left special factor u in $\mathbf{t}$ has $k = |A|$ different
left extensions in $\mathbf{t}$ (i.e., xu is a factor of $\mathbf{t}$ for all letters x in the k-letter alphabet A). As
a consequence, k-strict episturmian words have factor complexity $(k - 1)n + 1$ for each $n \in \mathbb{N}$
(see [32, Theorem 7]); such words are exactly the k-letter Arnoux-Rauzy sequences, the study of
which began in [12] (see also [74, 105] for example). In particular, the 2-strict episturmian words
correspond to the (aperiodic) Sturmian words. Arnoux-Rauzy sequences will be discussed further
in Section 5.
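
For a concrete 3-strict example, the complexity $(k - 1)n + 1 = 2n + 1$ and the left-extension property can be observed on a prefix of the Tribonacci word. The Python sketch below is illustrative only: it assumes the Tribonacci word is generated by the morphism a ↦ ab, b ↦ ac, c ↦ a, uses hypothetical helper names, and inspects a finite prefix, which is assumed long enough to contain all short factors.

```python
def tribonacci_prefix(n_iter: int) -> str:
    """Prefix of the Tribonacci word via the (assumed) morphism a -> ab, b -> ac, c -> a."""
    images = {"a": "ab", "b": "ac", "c": "a"}
    word = "a"
    for _ in range(n_iter):
        word = "".join(images[ch] for ch in word)
    return word


def factors(word: str, n: int) -> set:
    """Set of factors of length n occurring in word."""
    return {word[i:i + n] for i in range(len(word) - n + 1)}


def left_extensions(word: str, u: str, alphabet: str) -> set:
    """Letters x such that xu occurs in word."""
    longer = factors(word, len(u) + 1)
    return {x for x in alphabet if x + u in longer}


if __name__ == "__main__":
    w = tribonacci_prefix(12)
    for n in range(1, 7):
        facs = factors(w, n)
        special = [u for u in facs if len(left_extensions(w, u, "abc")) >= 2]
        # Columns: n, number of factors, left special factors of length n.
        print(n, len(facs), special)
```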

Remark 2.7. A noteworthy fact is that an episturmian word is periodic if and only if $|\mathrm{Ult}(\Delta)| = 1$
(see [73, Proposition 2.9]). The exact form of a periodic episturmian word is given by Theorem 4.15
in Section 4.4. We first need to consider episturmian morphisms.

3 Episturmian morphisms 

From Lemma 4 in [33], if $\mathbf{s}$ is epistandard with first letter $a = x_1$, then a is separating for $\mathbf{s}$ and
its factors, i.e., any factor of s of length 2 contains the letter a. Any episturmian word t that is 
factor-equivalent to s also has separating letter a, and hence can be factorized with a code: 

{a} U a(A \ {a}) if t begins with a, 
{a} U (A \ {a})a otherwise. 

This leads to episturmian morphisms, which were introduced by Justin and Pirillo [73] in order 
to study deeper properties of episturmian words. As we shall see in Section 3.2, episturmian
morphisms are precisely the morphisms that preserve the set of aperiodic episturmian words (i.e.,
the morphisms that map aperiodic episturmian words onto aperiodic episturmian words). Such
morphisms naturally generalize to any finite alphabet the Sturmian morphisms on two letters.
A morphism $\varphi$ is said to be Sturmian if $\varphi(\mathbf{s})$ is Sturmian for any Sturmian word $\mathbf{s}$. The set of
Sturmian morphisms over {a, b} is closed under composition, and consequently it is a submonoid of
the endomorphisms of {a, b}*. Moreover, it is well known that the monoid of Sturmian morphisms
is generated by the three morphisms: $(a \mapsto ab, b \mapsto a)$, $(a \mapsto ba, b \mapsto a)$, $(a \mapsto b, b \mapsto a)$ and that
Sturmian morphisms are precisely the morphisms that map Sturmian words onto Sturmian words 
(see [El ED). 

3.1 Generators & monoids 

By definition (see [43, 73]), the monoid of all episturmian morphisms $\mathcal{E}$ is generated, under
composition, by all the morphisms:

• $\psi_a$: $\psi_a(a) = a$, $\psi_a(x) = ax$ for any letter $x \ne a$;

• $\bar{\psi}_a$: $\bar{\psi}_a(a) = a$, $\bar{\psi}_a(x) = xa$ for any letter $x \ne a$;

• $\theta_{ab}$: exchange of letters a and b.

Note. This system of generators is far from minimal, e.g., $\psi_a = \theta_{ab}\psi_b\theta_{ab}$, but gives simpler
formulae.
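
The generators listed above are easy to experiment with when morphisms are stored as dictionaries of letter images. The following Python sketch, over the assumed alphabet {a, b, c}, implements the three families and checks an instance of the redundancy mentioned in the Note; the encoding and the function names are illustrative assumptions.

```python
ALPHABET = "abc"  # assumed example alphabet


def psi(a: str) -> dict:
    """psi_a: a -> a, x -> ax for x != a."""
    return {x: (a if x == a else a + x) for x in ALPHABET}


def psi_bar(a: str) -> dict:
    """psi-bar_a: a -> a, x -> xa for x != a."""
    return {x: (a if x == a else x + a) for x in ALPHABET}


def theta(a: str, b: str) -> dict:
    """theta_ab: exchange of the letters a and b."""
    swap = {a: b, b: a}
    return {x: swap.get(x, x) for x in ALPHABET}


def apply(morphism: dict, word: str) -> str:
    """Image of a finite word under a morphism."""
    return "".join(morphism[x] for x in word)


def compose(f: dict, g: dict) -> dict:
    """Composition fg, i.e. the morphism x -> f(g(x))."""
    return {x: apply(f, g[x]) for x in ALPHABET}


if __name__ == "__main__":
    t = theta("a", "b")
    print(compose(t, compose(psi("b"), t)) == psi("a"))
    print(compose(t, compose(psi_bar("b"), t)) == psi_bar("a"))
```

Composition is written here in the same order as juxtaposition of morphisms, so that `compose(f, g)` corresponds to fg.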

Moreover, the monoid of so-called epistandard morphisms $\mathcal{S}$ is generated by all the $\psi_a$ and the
$\theta_{ab}$, and the monoid of pure episturmian morphisms $\mathcal{E}_p$ (resp. pure epistandard morphisms $\mathcal{S}_p$) is
generated by the $\psi_a$ and $\bar{\psi}_a$ only (resp. the $\psi_a$ only). The monoid $\mathcal{P}$ of the permutation morphisms
(i.e., the morphisms $\varphi$ such that $\varphi(A) = A$) is generated by all the $\theta_{ab}$. The importance of the
monoid of pure episturmian morphisms will become clearer in the next section where we shall see 
that such morphisms are strongly linked to spinned directive words of episturmian words, which 
can be viewed as infinite sequences of rules for decomposing episturmian words by morphisms
(see Theorems 3.1 and 3.3 to follow). In particular, any episturmian word is the image of another
episturmian word by some pure episturmian morphism.

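The following diagram illustrates the inclusions between the monoids defined above.

[Diagram: inclusions between the monoids, namely $\mathcal{S}_p \subset \mathcal{S} \subset \mathcal{E}$, $\mathcal{S}_p \subset \mathcal{E}_p \subset \mathcal{E}$, and $\mathcal{P} \subset \mathcal{S}$.]

Semidirect products: $\mathcal{S} = \mathcal{S}_p \rtimes \mathcal{P}$, $\mathcal{E} = \mathcal{E}_p \rtimes \mathcal{P}$.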

We note in particular that the monoid $\mathcal{E}$ is a semidirect product of the submonoids of its
pure morphisms and of its permutations. Consequently, any episturmian morphism $\varphi \in \mathcal{E}$ can be
expressed in a unique way as $\varphi = \pi\mu = \mu'\pi$, where $\mu$, $\mu'$ are pure episturmian morphisms and $\pi$
is a permutation.

Note. The episturmian morphisms are exactly the Sturmian morphisms when $|A| = 2$.

Clearly, all episturmian morphisms on A can be viewed as automorphisms of the free group
generated by A (e.g., see [5JJ ESI ES EE]) and it follows that they are injective and that the
monoids $\mathcal{E}$ and $\mathcal{S}$ are left cancellative (see [99, Lemma 7.2]), which means that for any episturmian
morphisms f, g, h, if fg = fh then g = h. Other fundamental properties of episturmian morphisms
will be discussed in the next section and in Section 4. For an in-depth study of some further
properties of these morphisms, the interested reader is referred to Richomme's paper [99], in which 
he considers invertibility, presentation, cancellativity, unitarity, characterization by conjugacy, and 
so on. Most of the results in [99] naturally generalize those already known for Sturmian morphisms, 
but some new ones are also proved, such as a characterization of episturmian morphisms that 
preserve palindromes. In [100, 103], Richomme also characterized the episturmian morphisms
that preserve finite and infinite Lyndon words and those that preserve a lexicographic order on 
words. 



3.2 Relation with episturmian words 

We now state two insightful characterizations of epistandard and episturmian words, which show 
that any episturmian word can be infinitely decomposed over the set of pure episturmian mor- 
phisms. 

In the 'standard' case: 

Theorem 3.1. [73, Corollary 2.7] An infinite word $\mathbf{s} \in A^\omega$ is epistandard if and only if there
exists an infinite word $\Delta = x_1 x_2 \cdots$ over A and a sequence $(\mathbf{s}^{(i)})_{i \ge 0}$ of recurrent infinite words
such that $\mathbf{s}^{(0)} = \mathbf{s}$ and $\mathbf{s}^{(i-1)} = \psi_{x_i}(\mathbf{s}^{(i)})$ for $i > 0$. □






In [73], Justin and Pirillo showed that the infinite word $\Delta$ appearing in the above theorem
is exactly the directive word of $\mathbf{s}$ that arises from the equivalent definition of epistandard words
given in Theorem 2.3. In the binary case, the directive word $\Delta$ is related to the continued fraction
expansion of the slope of the straight line represented by a standard word (see Chapter 2 in [83]).

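Example 3.2. Recall the Tribonacci word $\mathbf{r}$, which has directive word $\Delta = (abc)^\omega$. We have
$\mathbf{r} = \psi_a(\mathbf{r}^{(1)})$, where $\mathbf{r}^{(1)}$ is directed by $T(\Delta) = (bca)^\omega$. Notice that $\mathbf{r}^{(1)} = \pi(\mathbf{r})$ with $\pi = (abc)$; a
very particular case.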
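
The decomposition in Example 3.2 can be verified on finite prefixes. In the Python sketch below, $\pi = (abc)$ is read as the cyclic permutation a ↦ b, b ↦ c, c ↦ a (an interpretation consistent with $\mathbf{r}^{(1)}$ being directed by $(bca)^\omega$), the Tribonacci prefix is obtained by iterated palindromic closure, and all function names are hypothetical.

```python
def right_closure(w: str) -> str:
    """Shortest palindrome having w as a prefix (naive search)."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w


def pal(w: str) -> str:
    """Iterated palindromic closure of a finite word."""
    result = ""
    for x in w:
        result = right_closure(result + x)
    return result


def apply(morphism: dict, word: str) -> str:
    """Image of a finite word under a morphism given by its letter images."""
    return "".join(morphism[x] for x in word)


PI = {"a": "b", "b": "c", "c": "a"}       # the cyclic permutation (abc)
PSI_A = {"a": "a", "b": "ab", "c": "ac"}  # psi_a on the alphabet {a, b, c}

if __name__ == "__main__":
    r = pal("abc" * 3)            # a palindromic prefix of the Tribonacci word
    r1 = apply(PI, r)             # the corresponding prefix of r^(1)
    image = apply(PSI_A, r1)      # should agree with r on their common length
    n = min(len(r), len(image))
    print(r1 == pal("bca" * 3), image[:n] == r[:n])
```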

More generally, the following result (Theorem 3.3) extends the notion of a directive word to
all episturmian words. Before stating the theorem, we need to introduce some more notation.
First we define a new alphabet, $\bar{A} := \{\bar{x} \mid x \in A\}$. A letter $\bar{x} \in \bar{A}$ is considered to be x with
spin 1, whilst x itself has spin 0. The notion of a spin provides a convenient way to call upon
the elementary pure episturmian morphisms $\psi_x$ and $\bar{\psi}_x$. Moreover, as we shall see in Section 4,
it allows us to derive many properties of episturmian words from episturmian morphisms (as a
consequence of the next theorem) . This approach is used for instance in [231 USB ED HOlj 11021 1105] 
and of course in the papers of Justin et al. 

A finite or infinite word over $A \cup \bar{A}$ is said to be a spinned word. Given a finite or infinite
word $w = x_1 x_2 \cdots$ over A, we sometimes denote by $\breve{w} = \breve{x}_1 \breve{x}_2 \cdots$ any spinned word such that
$\breve{x}_i = x_i$ if $\breve{x}_i$ has spin 0 and $\breve{x}_i = \bar{x}_i$ if $\breve{x}_i$ has spin 1. Such a word $\breve{w}$ is called a spinned version
of w.

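Theorem 3.3. [73, Theorem 3.10] An infinite word $\mathbf{t} \in A^\omega$ is episturmian if and only if there
exists a spinned infinite word $\breve{\Delta} = \breve{x}_1 \breve{x}_2 \breve{x}_3 \cdots$ over $A \cup \bar{A}$ and an infinite sequence $(\mathbf{t}^{(i)})_{i \ge 0}$ of
recurrent infinite words such that

$\mathbf{t}^{(0)} = \mathbf{t}$ and $\mathbf{t}^{(i-1)} = \psi_{x_i}(\mathbf{t}^{(i)})$ or $\mathbf{t}^{(i-1)} = \bar{\psi}_{x_i}(\mathbf{t}^{(i)})$ for all $i > 0$,

according to the spin 0 or 1 of $\breve{x}_i$, respectively.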

For any epistandard word (resp. episturmian word) $\mathbf{t}$ and infinite word $\Delta$ (resp. spinned infinite
word $\breve{\Delta}$) satisfying the conditions of Theorem 3.1 (resp. Theorem 3.3), we say that $\Delta$ (resp. $\breve{\Delta}$)
is a directive word (resp. a (spinned) directive word) for $\mathbf{t}$ or $\mathbf{t}$ is directed by $\Delta$ (resp. $\breve{\Delta}$).

Remark 3.4. It follows immediately from Theorem 3.3 that if $\mathbf{t}$ is an episturmian word directed
by a spinned infinite word $\breve{\Delta}$, then each $\mathbf{t}^{(n)}$ (as defined in the theorem) is an episturmian word
directed by $T^n(\breve{\Delta}) = \breve{x}_{n+1} \breve{x}_{n+2} \breve{x}_{n+3} \cdots$.

The following important fact links Theorems 3.1 and 3.3.

Remark 3.5. [73] If $\mathbf{t}$ is an episturmian word directed by a spinned version $\breve{\Delta}$ of an infinite word
$\Delta$ over A, then $\mathbf{t}$ is factor-equivalent to the (unique) epistandard word $\mathbf{s}$ directed by $\Delta$.

Moreover, with the same notation as in the above remark, the episturmian word $\mathbf{t}$ is periodic
if and only if the epistandard word $\mathbf{s}$ is periodic, and this holds if and only if $|\mathrm{Ult}(\Delta)| = 1$ (see
Remark 2.7 or Theorem 4.15 later).

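Example 3.6. Consider the episturmian word $\mathbf{m} = baabacabab\cdots$ directed by $\breve{\Delta} = \bar{a}b\bar{c}(abc)^\omega$.
Observe that $\mathbf{m}$ is factor-equivalent to the Tribonacci word $\mathbf{r}$, and we have

$\mathbf{m} = \bar{\psi}_a(\mathbf{m}^{(1)}) = \bar{\psi}_a\psi_b(\mathbf{m}^{(2)}) = \bar{\psi}_a\psi_b\bar{\psi}_c(\mathbf{m}^{(3)})$,

where $\mathbf{m}^{(3)}$ is directed by $T^3(\breve{\Delta}) = (abc)^\omega$, i.e., $\mathbf{m}^{(3)} = \mathbf{r}$.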
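
The prefix of $\mathbf{m}$ given in Example 3.6 can be recovered by applying the three morphisms to a prefix of the Tribonacci word. The Python sketch below assumes the spins $\bar{a}$, $b$, $\bar{c}$ for the first three letters of the directive word (the reading under which the stated prefix baabacabab is obtained); the function names are hypothetical.

```python
ALPHABET = "abc"


def psi(a: str) -> dict:
    """psi_a: a -> a, x -> ax for x != a (spin 0)."""
    return {x: (a if x == a else a + x) for x in ALPHABET}


def psi_bar(a: str) -> dict:
    """psi-bar_a: a -> a, x -> xa for x != a (spin 1)."""
    return {x: (a if x == a else x + a) for x in ALPHABET}


def apply(morphism: dict, word: str) -> str:
    """Image of a finite word under a morphism."""
    return "".join(morphism[x] for x in word)


def right_closure(w: str) -> str:
    """Shortest palindrome having w as a prefix (naive search)."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w


def pal(w: str) -> str:
    """Iterated palindromic closure of a finite word."""
    result = ""
    for x in w:
        result = right_closure(result + x)
    return result


if __name__ == "__main__":
    r = pal("abcabc")  # prefix of the Tribonacci word, playing the role of m^(3)
    # Apply the innermost morphism first: psi-bar_c, then psi_b, then psi-bar_a.
    m = apply(psi_bar("a"), apply(psi("b"), apply(psi_bar("c"), r)))
    print(m[:20])
```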

Example 3.7. We now consider an example where the condition that the $\mathbf{t}^{(i)}$ in Theorem 3.3
are recurrent is not satisfied. Let $\mathbf{t} = d\mathbf{r} = dabacabaabacaba\cdots$ where $\mathbf{r}$ is the Tribonacci word
and d is a letter. Then $\mathbf{t} = \bar{\psi}_a(\mathbf{t}^{(1)})$, $\mathbf{t}^{(1)} = \bar{\psi}_b(\mathbf{t}^{(2)})$, $\mathbf{t}^{(2)} = \bar{\psi}_c(\mathbf{t}^{(3)})$, and so on; however, these $\mathbf{t}^{(i)}$
are not recurrent (and $\mathbf{t}$ is not episturmian). The infinite word $\mathbf{t} = d\mathbf{r}$ is actually an example of
an episkew word, i.e., a non-recurrent infinite word having episturmian factors. Such words are
discussed in more detail in Section 7.2.

Remark 3.8. Let us point out that the construction of epistandard words by palindromic closure
(given in Theorem 2.3) extends to all episturmian words: when $\breve{x}_n = \bar{x}_n$ write $x_n$ on the left and
use palindromic left-closure. Here $\mathbf{m}$ (from the above example) appears step by step on the right:

a ·
a · ba
abaca · ba
abaca · baabacaba
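
The step-by-step construction of Remark 3.8 can be reproduced mechanically. The Python sketch below is one possible reading of the remark, under the assumption that a spinned letter is encoded as a pair (letter, spin) and that the dot marks the boundary between what has been added on the left and on the right; the names `spinned_closure`, `left_closure`, and `right_closure` are hypothetical.

```python
def right_closure(w: str) -> str:
    """Shortest palindrome having w as a prefix (naive search)."""
    for i in range(len(w) + 1):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w


def left_closure(w: str) -> str:
    """Shortest palindrome having w as a suffix."""
    return right_closure(w[::-1])[::-1]


def spinned_closure(directive):
    """Return (left, right) parts around the dot after processing a spinned word.

    directive is a sequence of (letter, spin) pairs; spin 0 appends the letter
    on the right and closes to the right, spin 1 prepends it and closes to the left.
    """
    left, right = "", ""
    for letter, spin in directive:
        word = left + right
        if spin == 0:
            new = right_closure(word + letter)
            right = new[len(left):]          # growth happens to the right of the dot
        else:
            new = left_closure(letter + word)
            left = new[:len(new) - len(word)] + left  # growth happens to the left
    return left, right


if __name__ == "__main__":
    steps = [("a", 1), ("b", 0), ("c", 1), ("a", 0)]
    for k in range(1, len(steps) + 1):
        left, right = spinned_closure(steps[:k])
        print(f"{left} · {right}")
```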

When an episturmian word is aperiodic, we have the following fundamental link between the
words $(\mathbf{t}^{(n)})_{n \ge 0}$ and the spinned infinite word $\breve{\Delta}$ occurring in Theorem 3.3: if $a_n$ is the first
letter of $\mathbf{t}^{(n)}$, then $\mu_{\breve{x}_1 \cdots \breve{x}_n}(a_n)$ is a prefix of $\mathbf{t}$ and, since the sequence $(\mu_{\breve{x}_1 \cdots \breve{x}_n}(a_n))_{n \ge 1}$ is not ultimately
constant (because $\breve{\Delta}$ is not ultimately constant), $\mathbf{t} = \lim_{n \to \infty} \mu_{\breve{x}_1 \cdots \breve{x}_n}(a_n)$. This fact is a slight
generalization of a result of Risley and Zamboni [105, Prop. III.7] on S-adic representations for
standard Arnoux-Rauzy sequences. See also the recent paper [23] for S-adic representations of
Sturmian words. Note that S-adic dynamical systems were introduced by Ferenczi [50] as minimal
dynamical systems (e.g., see [96]) generated by a finite number of substitutions. In the case of
episturmian words, the notion itself is actually a reformulation of the well-known Rauzy rules, as 
studied in [98]. In fact, it is well known that the subshift of an aperiodic episturmian word t (i.e., 
the topological closure of the shift orbit of t) is a minimal dynamical system, i.e., it consists of 
all the episturmian words with the same set of factors as t. 

It is not hard to see that a morphism is episturmian (resp. epistandard) if and only if it 
preserves the set of aperiodic episturmian (resp. epistandard) words (see [73]). Even more:

Theorem 3.9. [73, Theorem 3.13] A morphism $\varphi$ is episturmian (resp. epistandard) if there exist
strict episturmian (resp. epistandard) words $\mathbf{m}$, $\mathbf{t}$ such that $\mathbf{m} = \varphi(\mathbf{t})$. □

Purely morphic episturmian words (i.e., those generated by morphisms) are discussed further
in Section 4, where we consider the relationship between spins and the shifts that they induce.
These ideas were used in [75] to obtain a complete answer to the question: if an episturmian
word is purely morphic, which shifts of it, if any, are also purely morphic? (See Theorem 4.19 to
follow.) Such rigidity issues are discussed in more detail in Sections 4.4 and 5.

In [75], Justin and Pirillo also made use of bi-infinite words, which often allow for more natural
formulations. Indeed, the characterization (Theorem 3.3) of right-infinite episturmian words by
a sequence $(\mathbf{t}^{(i)})_{i \geq 0}$ extends to bi-infinite episturmian words, with all the $\mathbf{t}^{(i)}$ now bi-infinite
episturmian words. That is, as for right-infinite episturmian words, we have bi-infinite words of
the form $\mathbf{l}^{(i)}.\mathbf{r}^{(i)}$ where $\mathbf{l}^{(i)}$ is a left-infinite episturmian word and $\mathbf{r}^{(i)}$ is a right-infinite episturmian
word. Moreover, if the bi-infinite episturmian word $\mathbf{b} = \mathbf{l}.\mathbf{r}$ is directed by $\breve{\Delta}$ with associated
bi-infinite episturmian words $\mathbf{b}^{(i)} = \mathbf{l}^{(i)}.\mathbf{r}^{(i)}$, then $\mathbf{r}$ is directed by $\breve{\Delta}$ with associated right-infinite
episturmian words $\mathbf{r}^{(i)}$.

4 Spins, shifts, and directive words



In this section, we discuss in more detail the notion of spins, the shifts they induce, and the 
concept of block-equivalence in connection with directive words. These notions allow us to study 
in particular when two different spinned infinite words direct the same episturmian word. Indeed, 
as we shall see in Section 4.3, the correspondence between episturmian words and spinned directive
words is not one-to-one.

4.1 Notation for pure episturmian morphisms 

For $a \in \mathcal{A}$, let $\mu_a = \psi_a$ and $\mu_{\bar{a}} = \bar{\psi}_a$. This operator $\mu$ can be naturally extended (as done in [73])
to a pure episturmian morphism: for any spinned finite word $\breve{w} = \breve{x}_1 \cdots \breve{x}_n$ over $\mathcal{A} \cup \bar{\mathcal{A}}$, we define
$\mu_{\breve{w}} := \mu_{\breve{x}_1} \cdots \mu_{\breve{x}_n}$ and set $\mu_{\varepsilon}$ equal to the identity morphism Id.

Viewing $w = x_1 x_2 \cdots x_n$ as a prefix of the directive word $\Delta = x_1 x_2 x_3 \cdots \in \mathcal{A}^\omega$, it is clear from
Theorem 3.1 that the words

$$\mu_{x_1 \cdots x_{n-1}}(x_n), \quad n \geq 1,$$

are prefixes of the epistandard word $\mathbf{s}$ directed by $\Delta$.

Example 4.1. We observe that any epistandard word $\mathbf{s} \in \mathcal{A}^\omega$ has the form $\mathbf{s} = \mu_w(\mathbf{s}')$ for some
uniquely determined finite word $w$ and strict epistandard word $\mathbf{s}'$. Indeed, if $\Delta = x_1 x_2 x_3 \cdots \in \mathcal{A}^\omega$
is the directive word of $\mathbf{s}$ and $m$ is the smallest positive integer such that $\mathrm{Alph}(x_{m+1} x_{m+2} \cdots) =
\mathrm{Ult}(\Delta)$, then $x_1 \cdots x_m$ is the shortest prefix of $\Delta$ that contains all the letters not appearing
infinitely often in $\Delta$. Moreover, by Theorem 3.1, $\mathbf{s} = \mu_{x_1 \cdots x_m}(\mathbf{s}^{(m)})$ where $\mathbf{s}^{(m)}$ is the epistandard
word directed by $T^m(\Delta) = x_{m+1} x_{m+2} \cdots$. Since $\mathrm{Ult}(T^m(\Delta)) = \mathrm{Alph}(T^m(\Delta))$ by construction,
the epistandard word $\mathbf{s}^{(m)}$ is strict. For example, with $\Delta = c(ab)^\omega$, we have $\mathbf{s} = \mu_c(\mathbf{s}^{(1)})$ where
$\mathbf{s}^{(1)}$ is directed by $(ab)^\omega$, i.e., $\mathbf{s}^{(1)}$ is the well-known Fibonacci word over {a, b}.

For $n \geq 1$, let $u_{n+1} := \mathrm{Pal}(x_1 \cdots x_n)$ and set $u_1 = \varepsilon$. Then by part iv) of Theorem 2.3,
the epistandard word $\mathbf{s}$ directed by $\Delta$ is given by $\mathbf{s} = \lim_{n \to \infty} u_n$. We have the following useful
formula from [73]:

$$u_{i+1} = \mu_{x_1 \cdots x_{i-1}}(x_i)\, u_i \quad \text{for } i > 0. \tag{4.1}$$

For letters $(x_j)_{1 \leq j \leq i}$, formula (4.1) inductively leads to:

$$u_{i+1} = \mu_{x_1 \cdots x_{i-1}}(x_i) \cdots \mu_{x_1}(x_2)\, x_1 = \prod_{j = i, \ldots, 1} \mu_{x_1 \cdots x_{j-1}}(x_j). \tag{4.2}$$

(Note that by convention, $x_1 \cdots x_0 = \varepsilon$ in the above product.) For example, with $\Delta = abcb\cdots$,
we compute:

$$u_5 = \mathrm{Pal}(abcb) = \mu_{abc}(b)\,\mu_{ab}(c)\,\mu_a(b)\,a = abacab \cdot abac \cdot ab \cdot a.$$
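
Formula (4.2) is easy to check mechanically. The following Python sketch is an illustration only: the helper names `mu`, `pal_closure`, `Pal` and `product_formula` are ours (not notation from [73]), and words are modelled as plain strings in which every letter has spin 0.

```python
def mu(w, v):
    """Apply mu_w = psi_{x_1} ... psi_{x_n} to the word v (all spins 0)."""
    for x in reversed(w):  # the right-most morphism acts first
        v = "".join(c if c == x else x + c for c in v)
    return v

def pal_closure(w):
    """Shortest palindrome having w as a prefix (palindromic right-closure)."""
    for i in range(len(w)):
        if w[i:] == w[i:][::-1]:  # w[i:] is the longest palindromic suffix
            return w + w[:i][::-1]
    return w  # only reached for the empty word

def Pal(w):
    """Iterated palindromic closure of w."""
    u = ""
    for x in w:
        u = pal_closure(u + x)
    return u

def product_formula(w):
    """Right-hand side of (4.2): mu_{x_1...x_{j-1}}(x_j) for j = n, ..., 1."""
    return "".join(mu(w[:j], w[j]) for j in reversed(range(len(w))))

print(Pal("abcb"))              # abacababacaba
print(product_formula("abcb"))  # abacababacaba
```

Both calls return the word abacab · abac · ab · a computed above.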

4.2 Shifts 

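Now let $\breve{w} = \breve{x}_1 \breve{x}_2 \cdots \breve{x}_n$ be a spinned version of $w = x_1 x_2 \cdots x_n$ (viewed as a prefix of a spinned
version $\breve{\Delta}$ of $\Delta$). Then, for any finite word $v$, we have

$$\mu_{\breve{w}}(v) = S_{\breve{w}}^{-1}\, \mu_w(v)\, S_{\breve{w}} \quad \text{where} \quad S_{\breve{w}} = \prod_{\substack{i = n, \ldots, 1 \\ \breve{x}_i = \bar{x}_i}} \mu_{x_1 \cdots x_{i-1}}(x_i). \tag{4.3}$$

Observe that $S_{\breve{w}}$ is a prefix of $\mathrm{Pal}(w)$; in particular $S_{\bar{w}} = \mathrm{Pal}(w)$ by equation (4.2). Note also
that $\mu_{\breve{w}}(v) = T^{|S_{\breve{w}}|}(\mu_w(v))$. The word $S_{\breve{w}}$ is called the shifting factor of $\mu_{\breve{w}}$ and its length $|S_{\breve{w}}|$ is
called the shift induced by the prefix $\breve{w}$ of $\breve{\Delta}$ of length $n$ [75].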

Example 4.2. If we take $\breve{w} = a\bar{b}c\bar{a}$, then

$$S_{\breve{w}} = \mu_{abc}(a)\,\mu_a(b) = abacaba \cdot ab.$$

Thus since $\mu_{abca}(ca) = abacabaab \cdot acabacaba$, we have

$$\mu_{a\bar{b}c\bar{a}}(ca) = T^9(\mu_{abca}(ca)) = acabacaba \cdot abacabaab.$$

Likewise, for any infinite word $\mathbf{y} \in \mathcal{A}^\omega$, $\mu_{\breve{w}}(\mathbf{y}) = S_{\breve{w}}^{-1}\mu_w(\mathbf{y})$. For example, if we take $\breve{w} = \bar{a}\bar{b}$,
then $S_{\breve{w}} = \mathrm{Pal}(ab) = aba$, and hence $\mu_{\bar{a}\bar{b}}(\mathbf{y}) = (aba)^{-1}\mu_{ab}(\mathbf{y})$ for any infinite word $\mathbf{y}$.

Note. The morphisms $\mu_w$ and $\mu_{\breve{w}}$ are conjugate morphisms [99].
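
The shift formula (4.3) can be checked numerically on the word of Example 4.2. The sketch below is illustrative and rests on an encoding of our own: a spinned word is a string in which an uppercase letter stands for a letter of spin 1, so that `"aBcA"` encodes the spinned version of abca whose second and fourth letters have spin 1. The names `mu`, `shifting_factor` and `rotate` are hypothetical helpers.

```python
def mu(w, v):
    """Apply mu_w to v; an uppercase letter in w encodes spin 1 (our convention)."""
    for ch in reversed(w):  # the right-most morphism acts first
        x = ch.lower()
        if ch.isupper():    # spin 1: psi-bar_x maps y to yx for y != x
            v = "".join(c if c == x else c + x for c in v)
        else:               # spin 0: psi_x maps y to xy for y != x
            v = "".join(c if c == x else x + c for c in v)
    return v

def shifting_factor(w):
    """Product in (4.3): positions i = n, ..., 1 whose letter has spin 1."""
    plain = w.lower()
    return "".join(mu(plain[:i], plain[i])
                   for i in reversed(range(len(w))) if w[i].isupper())

def rotate(u, k):
    """Circular shift T^k of a finite word."""
    k %= len(u)
    return u[k:] + u[:k]

w = "aBcA"
S = shifting_factor(w)
print(S)                                                 # abacabaab
print(mu(w, "ca") == rotate(mu("abca", "ca"), len(S)))   # True
```

Equivalently, concatenating `S` in front of `mu(w, "ca")` gives the same word as appending `S` to `mu("abca", "ca")`, which is the conjugacy expressed by (4.3).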

4.3 Block-equivalence & directive words 

By Theorem 2.3 (and also Theorem 3.1), any epistandard word $\mathbf{s} \in \mathcal{A}^\omega$ has a unique directive
word over $\mathcal{A}$, but $\mathbf{s}$ also has infinitely many other spinned directive words (see [73, 75, 63]). For
example, the Tribonacci word is directed by $(abc)^\omega$ and also by $(abc)^n \bar{a}\bar{b}\bar{c}(a\bar{b}\bar{c})^\omega$ for each $n \geq 0$,
as well as infinitely many other spinned words. The natural question: "does any spinned word
direct a unique episturmian word?" was answered in [73].

Proposition 4.3. [73] 

1. Any spinned infinite word $\breve{\Delta}$ having infinitely many letters with spin 0 directs a unique
episturmian word beginning with the left-most letter having spin 0 in $\breve{\Delta}$.

2. Any spinned infinite word $\breve{\Delta}$ with all spins ultimately 1 directs exactly $|\mathrm{Ult}(\Delta)|$ episturmian
words.

3. Let $\breve{\Delta}$ be a spinned infinite word having all its letters with spin 1 and let $a \in \mathrm{Ult}(\Delta)$. Then
$\breve{\Delta}$ directs exactly one episturmian word starting with $a$. □

Note. The above statement corrects a small error in Proposition 3.11 of [73] where item 3 was
stated in the more general case when $\breve{\Delta}$ has all spins ultimately 1. In this case, $\breve{\Delta}$ still directs
exactly one episturmian word for each letter $a$ in $\mathrm{Ult}(\Delta)$, but contrary to what is written in [73],
nothing can be said about its first letter.

Block-equivalence for spinned words was introduced in [75] as a way of studying when $\breve{\Delta}$ and
$\hat{\Delta}$ (two spinned versions of a directive word $\Delta$) direct the same bi-infinite episturmian word. We
do not recall the full details here, only a few notions relating to it.

Notation. If $v \in \mathcal{A}^+$, then $\bar{v} \in \bar{\mathcal{A}}^+$ is $v$ with all spins 1.

A word of the form $xvx$, where $x \in \mathcal{A}$ and $v \in (\mathcal{A} \setminus \{x\})^*$, is called a ($x$-based) block. A ($x$-
based) block-transformation is the replacement in a spinned word of an occurrence of $xv\bar{x}$ (where
$xvx$ is a block) by $\bar{x}\bar{v}x$ or vice-versa. Two finite spinned words $w$, $w'$ are said to be block-equivalent
if we can pass from one to the other by a (possibly empty) chain of block-transformations, in
which case we write $w \equiv w'$. For example, babcbac and babcbac are block-equivalent because
babcbac → babcbac → babcbac and vice-versa. Note that if $w \equiv w'$ then $w$ and $w'$ are spinned
versions of the same word over $\mathcal{A}$. Block-equivalence extends to (right-)infinite words as follows.

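Let $\Delta_1$, $\Delta_2$ be spinned versions of $\Delta$. We write $\Delta_1 \rightsquigarrow \Delta_2$ if there exist infinitely many
prefixes $f_i$ of $\Delta_1$ and $g_i$ of $\Delta_2$ with the $g_i$ of strictly increasing lengths, and such that, for all $i$,
$|g_i| \leq |f_i|$ and $f_i \equiv g_i c_i$ for a suitable spinned word $c_i$. Infinite words $\Delta_1$ and $\Delta_2$ are said to be
block-equivalent (denoted by $\Delta_1 \equiv \Delta_2$) if $\Delta_1 \rightsquigarrow \Delta_2$ and $\Delta_2 \rightsquigarrow \Delta_1$.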

Remark 4.4. If $x$ is a letter and $v \in \mathcal{A}^*$ is $x$-free, then $xv\bar{x}$ and $\bar{x}\bar{v}x$ are block-equivalent and they
induce the same shift, i.e., $\mu_{xv\bar{x}} = \mu_{\bar{x}\bar{v}x}$ [75, Theorem 2.2]. Thus the monoid of pure episturmian
morphisms, $S_p$, is isomorphic to the quotient of $(\mathcal{A} \cup \bar{\mathcal{A}})^*$ by the block-equivalence generated by

$$\{xv\bar{x} \equiv \bar{x}\bar{v}x \mid x \in \mathcal{A},\ v \text{ is } x\text{-free}\}.$$

Note that this has some relation to the study of conjugacy and episturmian morphisms carried
out by Richomme [99].
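
The identity behind block-transformations, namely that a block $xv\bar{x}$ and its transform $\bar{x}\bar{v}x$ define the same morphism, can be tested exhaustively for short blocks. The following Python sketch is an illustration under our own encoding (an uppercase letter stands for spin 1); `mu` is a hypothetical helper and the check only covers a three-letter alphabet with $|v| \leq 2$.

```python
from itertools import product

def mu(w, v):
    """Apply mu_w to v; an uppercase letter in w encodes spin 1 (our convention)."""
    for ch in reversed(w):
        x = ch.lower()
        if ch.isupper():    # spin 1: y -> yx for y != x
            v = "".join(c if c == x else c + x for c in v)
        else:               # spin 0: y -> xy for y != x
            v = "".join(c if c == x else x + c for c in v)
    return v

letters = "abc"
ok = True
for x in letters:
    others = [c for c in letters if c != x]
    for n in range(3):                                # x-free words v with |v| <= 2
        for v in map("".join, product(others, repeat=n)):
            left = x + v + x.upper()                  # x v x-bar
            right = x.upper() + v.upper() + x         # x-bar v-bar x
            # two morphisms agree as soon as they agree on every letter
            ok = ok and all(mu(left, t) == mu(right, t) for t in letters)
print(ok)  # True
```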

From what we have already learned about bi-infinite episturmian words (in Sections 2.2 and
3.2), it is clear that Justin and Pirillo's results about spinned infinite words directing the same
bi-infinite episturmian word are still valid for words directing the same (right-infinite) episturmian
word. Roughly speaking, two spinned infinite words direct the same episturmian word if and only
if they are block-equivalent. For instance, we have the following results for wavy spinned versions
of $\Delta \in \mathcal{A}^\omega$. A spinned version $\breve{\Delta}$ of $\Delta$ is said to be wavy if $\breve{\Delta}$ contains infinitely many letters of
spin 0 and infinitely many letters of spin 1.

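Theorem 4.5. [75, Theorem 3.4] Suppose $\breve{\Delta}$ and $\hat{\Delta}$ are wavy versions of $\Delta \in \mathcal{A}^\omega$ with $|\mathrm{Ult}(\Delta)| >
1$. Then $\breve{\Delta}$ and $\hat{\Delta}$ direct the same episturmian word if and only if $\breve{\Delta} \equiv \hat{\Delta}$. □

For example, $ba(\bar{b}c\bar{a})^\omega$ and $\bar{b}\bar{a}b(c\bar{a}\bar{b})^\omega$ direct the same episturmian word, namely $\mu_{ba\bar{b}c}(c\mathbf{r})$
($= \mu_{\bar{b}\bar{a}bc}(c\mathbf{r})$) where $\mathbf{r}$ is the Tribonacci word.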

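Theorem 4.6. [75, Prop. 3.6] Let $\breve{\Delta}$, $\hat{\Delta}$ be two spinned versions of $\Delta \in \mathcal{A}^\omega$ with $|\mathrm{Ult}(\Delta)| > 1$,
$\breve{\Delta}$ wavy, and $\hat{\Delta}$ having all spins ultimately 0 or 1. If $\breve{\Delta}$ and $\hat{\Delta}$ direct the same episturmian word,
then $\breve{\Delta} \rightsquigarrow \hat{\Delta}$. □

Similar results also hold when all spins are ultimately 0 or 1 and in the periodic case. See
Propositions 3.7 and 3.10 in [75].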

Remark 4.7. In [75], the study of block-equivalence for finite spinned words led to numeration 
systems that resemble the Ostrowski systems [20] associated with Sturmian words. A matrix 
formula for computing the number of representations of an integer in such a system was also 
given in [75, Section 2].

More recently, Glen, Levé, and Richomme [63] established the following complete characterization
of pairs of spinned infinite words directing the same unique episturmian word. Not only
does the following theorem provide the relative forms of two spinned infinite words directing the 
same episturmian word, but it also fully solves the periodic case, which was only partially solved 
in [75]. 

Theorem 4.8. [63] Given two spinned infinite words $\Delta_1$ and $\Delta_2$, the following assertions are
equivalent.

i) $\Delta_1$ and $\Delta_2$ direct the same right-infinite episturmian word.

ii) One of the following cases holds for some $i, j$ such that $\{i, j\} = \{1, 2\}$:

1. $\Delta_i = \prod_{n \geq 1} v_n$, $\Delta_j = \prod_{n \geq 1} z_n$ where $(v_n)_{n \geq 1}$, $(z_n)_{n \geq 1}$ are spinned words such that
$\mu_{v_n} = \mu_{z_n}$ for all $n \geq 1$;

2. $\Delta_i = w x \prod_{n \geq 1} v_n \breve{x}_n$, $\Delta_j = w' \bar{x} \prod_{n \geq 1} \bar{v}_n \hat{x}_n$ where $w$, $w'$ are spinned words such that
$\mu_w = \mu_{w'}$, $x$ is a letter, $(v_n)_{n \geq 1}$ is a sequence of non-empty $x$-free words, and $(\breve{x}_n)_{n \geq 1}$,
$(\hat{x}_n)_{n \geq 1}$ are sequences of non-empty spinned words over $\{x, \bar{x}\}$ such that, for all $n \geq 1$,
$|\breve{x}_n| = |\hat{x}_n|$ and $|\breve{x}_n|_x = |\hat{x}_n|_x$;

3. $\Delta_1 = w\mathbf{x}$ and $\Delta_2 = w'\mathbf{y}$ where $w$, $w'$ are spinned words, $x$ and $y$ are letters, and
$\mathbf{x} \in \{x, \bar{x}\}^\omega$, $\mathbf{y} \in \{y, \bar{y}\}^\omega$ are spinned infinite words such that $\mu_w(x) = \mu_{w'}(y)$.

□

In items 1 and 2 of Theorem 4.8, the two considered directive words are spinned versions of
the same infinite word. This does not hold in item 3, which concerns only periodic episturmian
words. In particular, we observe the following:

Remark 4.9. If an aperiodic episturmian word is directed by two different spinned infinite words
$\Delta_1$ and $\Delta_2$, then $\Delta_1$ and $\Delta_2$ are spinned versions of the same word $\Delta$.

As an example of item 3, one can consider the periodic episturmian word $(bcba)^\omega$ which is
directed by both $bca^\omega$ and $b\bar{a}c^\omega$ (since $\mu_{bc}(a) = \mu_{b\bar{a}}(c)$). Note also that $(bcba)^\omega$ is epistandard and
has the same set of factors as the epistandard word $(babc)^\omega$ directed by $bac^\omega$. Actually, in view
of Remark 3.5, we observe the following:

Remark 4.10. The subshift of any aperiodic episturmian word contains a unique (aperiodic)
epistandard word, whereas the subshift of a periodic episturmian word contains exactly two
(periodic) epistandard words, except if this word is $a^\omega$ with $a$ a letter.

We also observe that $x$ and $y$ can be equal in item 3 of Theorem 4.8; for example, $(ab)^\omega$ is
directed by abb^ω and by ab^ω.

Example 4.11. [63] For $a$, $b$, $c$ three different letters in $\mathcal{A}$, the spinned infinite words $\Delta_1 = a(bc\bar{a})^\omega$
and $\Delta_2 = \bar{a}(\bar{b}\bar{c}\bar{a})^\omega$ direct the same episturmian word that starts with the letter $a$. Indeed, these
two directive words fulfill item 2 of Theorem 4.8 with $w = w' = \varepsilon$, $x = a$, and for all $n$, $v_n = bc$
and $\breve{x}_n = \hat{x}_n = \bar{a}$. Moreover the fact that $\Delta_1$ starts with the letter $a$ shows that the word it
directs starts with $a$. Similarly $\Delta'_1 = \bar{a}b(ca\bar{b})^\omega$ and $\Delta'_2 = \bar{a}\bar{b}(\bar{c}\bar{a}\bar{b})^\omega$ direct the same episturmian
word starting with the letter $b$. Since $\Delta_2 = \Delta'_2$, this shows that the relation "direct the same
episturmian word" over spinned infinite words is not transitive, and hence not an equivalence relation.

Items 2 and 3 of Theorem 4.8 show that any episturmian word is directed by a spinned infinite
word having infinitely many letters of spin 0, but also by a spinned word having both infinitely
many letters of spin 0 and infinitely many letters of spin 1 (i.e., a wavy word). To emphasize the
importance of these facts, let us recall from Proposition 4.3 that if $\breve{\Delta}$ is a spinned infinite word
over $\mathcal{A} \cup \bar{\mathcal{A}}$ with infinitely many letters of spin 0, then there exists a unique episturmian word $\mathbf{t}$
directed by $\breve{\Delta}$. Unicity comes from the fact that the first letter of $\mathbf{t}$ is fixed by the first letter of
spin 0 in $\breve{\Delta}$. We also note that if an episturmian word $\mathbf{t}$ has two directive words satisfying items
2 or 3, then $\mathbf{t}$ has infinitely many directive words (this was shown in [63]).

When studying repetitions in Sturmian words, Berthé, Holton, and Zamboni [23] proved that
any Sturmian word has a unique directive word over $\{a, b, \bar{a}, \bar{b}\}$ containing infinitely many letters of
spin 0, but no factor of the form $\bar{a}\bar{b}^n a$ or $\bar{b}\bar{a}^n b$ with $n$ an integer. Levé and Richomme [80] recently
generalized this result to episturmian words by introducing a way to 'normalize' the directive
word(s) of an episturmian word so that any episturmian word can be defined uniquely by its 
so-called normalized directive word, defined by some factor avoidance, as follows. This idea has 
since proved useful in the study of quasiperiodic episturmian words (see Section [HJ); in particular, 
it provides an effective way to decide whether or not a given episturmian word is quasiperiodic. 

Theorem 4.12. [63, 80] Any episturmian word $\mathbf{t} \in \mathcal{A}^\omega$ has a spinned directive word $\breve{\Delta}$ containing
infinitely many letters of spin 0, but no factor in $\bigcup_{a \in \mathcal{A}} \bar{a}\bar{\mathcal{A}}^* a$. Such a directive word is unique if
$\mathbf{t}$ is aperiodic, in which case $\breve{\Delta}$ is called the normalized directive word for $\mathbf{t}$. □

Note. Unicity does not necessarily hold for periodic episturmian words. For example, the periodic
episturmian word $(ab)^\omega = \psi_a(b^\omega) = \bar{\psi}_b(a^\omega)$ is directed by $ab^\omega$ and by $\bar{b}a^\omega$ (since $\psi_a(b) = ab =
\bar{\psi}_b(a)$).

The following result tells us precisely which episturmian words have a unique directive word. 

Theorem 4.13. [63] An episturmian word $\mathbf{t} \in \mathcal{A}^\omega$ has a unique directive word if and only if the
(normalized) directive word of $\mathbf{t}$ contains 1) infinitely many letters of spin 0, 2) infinitely many
letters of spin 1, 3) no factor in $\bigcup_{a \in \mathcal{A}} \bar{a}\bar{\mathcal{A}}^* a$, and 4) no factor in $\bigcup_{a \in \mathcal{A}} a\mathcal{A}^*\bar{a}$. Such an episturmian
word is necessarily aperiodic. □

For instance, a particular family of episturmian words having unique directive words consists
of those directed by regular wavy words [58, 64], i.e., spinned infinite words having both infinitely
many letters of spin 0 and infinitely many letters of spin 1 such that each letter occurs with the
same spin everywhere in the directive word. More formally, a spinned version $\breve{w}$ of a finite or
infinite word $w$ is said to be regular if, for each letter $x \in \mathrm{Alph}(w)$, all occurrences of $x$ in $\breve{w}$ have
the same spin (0 or 1). For example, the regular wavy word $(a\bar{b}\bar{c})^\omega$ is the unique directive word
for the episturmian word $a\mathbf{r} = aabacabaabacab\cdots$ where $\mathbf{r}$ is the Tribonacci word.

In the Sturmian case, we have: 

Proposition 4.14. [63] Any Sturmian word has either a unique spinned directive word or
infinitely many spinned directive words. Moreover, a Sturmian word has a unique directive word if
and only if its (normalized) directive word is regular wavy. □

As pointed out in [63], Proposition 4.14 shows a great difference between Sturmian words and
episturmian words constructed over alphabets with at least three letters. Indeed, when considering
words over a ternary alphabet, one can find episturmian words having exactly $m$ directive words
for any $m \geq 1$. For instance, the episturmian word $\mathbf{t}$ directed by $\breve{\Delta} = a(b\bar{a})^{m-1}b\bar{c}(ab\bar{c})^\omega$ has
exactly $m$ directive words, namely $(\bar{a}\bar{b})^i a(b\bar{a})^j b\bar{c}(ab\bar{c})^\omega$ with $i + j = m - 1$. Notice that the suffix
$b\bar{c}(ab\bar{c})^\omega$ of $\breve{\Delta}$ is regular wavy, and the other $m - 1$ spinned versions of $\Delta$ that also direct $\mathbf{t}$ arise
from the $m - 1$ words that are block-equivalent to the prefix $a(b\bar{a})^{m-1}$.



4.4 Periodic and purely morphic episturmian words 

We are now ready to describe periodic and purely morphic episturmian words. 

Recall from Remark 2.7 that the periodic episturmian words correspond to $|\mathrm{Ult}(\Delta)| = 1$. The
following theorem gives the form of such words in terms of pure episturmian morphisms.

Theorem 4.15. [73] An episturmian word is periodic if and only if it is $(\mu_{\breve{w}}(x))^\omega$ for some
spinned finite word $\breve{w}$ and letter $x$. □

For example, $(\mu_{a\bar{b}}(c))^\omega = (acab)^\omega$ is the periodic episturmian word directed by $a\bar{b}c^\omega$ (in fact,
it is epistandard as it is also directed by $acb^\omega$).
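
The period acab of this example can be recomputed directly. In the short Python sketch below (an illustration only), `mu` is a hypothetical helper and an uppercase letter in the first argument encodes spin 1, so `"aB"` stands for the spinned word whose letter b has spin 1.

```python
def mu(w, v):
    """Apply mu_w to v; an uppercase letter in w encodes spin 1 (our convention)."""
    for ch in reversed(w):
        x = ch.lower()
        if ch.isupper():    # spin 1: y -> yx for y != x
            v = "".join(c if c == x else c + x for c in v)
        else:               # spin 0: y -> xy for y != x
            v = "".join(c if c == x else x + c for c in v)
    return v

print(mu("aB", "c"))  # acab
print(mu("ac", "b"))  # acab
```

The second line uses a directive prefix with all spins 0, which is why the periodic word is epistandard. The other spin choices on ab give different periods: `mu("ab", "c")` is abac and `mu("AB", "c")` is caba.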

The next theorem characterizes purely morphic episturmian words with respect to their di- 
rective words. 

Theorem 4.16. [73, Theorem 3.14] An aperiodic episturmian word is purely morphic (i.e.,
generated by a morphism) if and only if it is directed by a periodic spinned infinite word $\breve{\Delta} = (\breve{f})^\omega$
for some spinned word $\breve{f}$. Moreover it can be generated by $\mu_{\breve{f}}$. □

We observe from Theorem 4.16 that any purely morphic episturmian word is strict (i.e., an
Arnoux-Rauzy sequence) as $\mathrm{Ult}(\Delta) = \mathrm{Alph}(f) = \mathrm{Alph}(\Delta)$. The proof of this theorem makes use
of Proposition 4.3 and Theorem 3.9.

Example 4.17. The Tribonacci word is generated by $\mu_{abc}$. Notice that $\mu_{abc} = \sigma^3$ where $\sigma$ is the
Tribonacci morphism defined by $\sigma : (a, b, c) \mapsto (ab, ac, a)$.
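
The identity between $\mu_{abc}$ and the cube of the Tribonacci morphism can be confirmed letter by letter. The following Python sketch is illustrative; the names `mu`, `sigma` and `sigma3` are ours, and words are plain strings with all spins 0.

```python
def mu(w, v):
    """Apply mu_w = psi_{x_1} ... psi_{x_n} to the word v (all spins 0)."""
    for x in reversed(w):
        v = "".join(c if c == x else x + c for c in v)
    return v

SIGMA = {"a": "ab", "b": "ac", "c": "a"}  # the Tribonacci morphism

def sigma(v):
    return "".join(SIGMA[c] for c in v)

def sigma3(v):
    return sigma(sigma(sigma(v)))

for x in "abc":
    print(x, mu("abc", x), sigma3(x))  # the two images coincide

# Iterating mu_abc on the letter a yields longer and longer prefixes
# of the Tribonacci word.
r = "a"
for _ in range(3):
    r = mu("abc", r)
print(r[:27])  # abacabaabacababacabaabacaba
```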

Remark 4.18. Purely morphic standard Sturmian words were previously characterized
independently in the following papers: [16, 38, 77]. Yasutomi [118] has since established a characterization
of all purely morphic Sturmian words with respect to their slopes and intercepts (when viewed 
as cutting sequences). An alternative geometric proof of Yasutomi's result was recently given by 
Berthe et al. in [21] . 

Using the notion of block-equivalence, Justin and Pirillo [75] explicitly determined which shifts, 
if any, of a purely morphic episturmian word are also purely morphic. 

Theorem 4.19. [75] If an episturmian word $\mathbf{t}$ is purely morphic, then its shift $T^i(\mathbf{t})$ is also purely
morphic if and only if $i$ belongs to some particular interval. □

See Section 4 of [75] for specific (and very technical) details. 

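Example 4.20. For the Tribonacci word $\mathbf{r}$, only itself and $T^{-1}(\mathbf{r})$ are purely morphic. Note that
$T^{-1}(\mathbf{r})$ corresponds to three episturmian words: $a\mathbf{r}$, $b\mathbf{r}$, $c\mathbf{r}$, directed by $(a\bar{b}\bar{c})^\omega$, $(\bar{a}b\bar{c})^\omega$, $(\bar{a}\bar{b}c)^\omega$,
respectively.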
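
A finite check supports this example. The Python sketch below is an illustration built on two assumptions of ours: a spinned word is encoded as a string in which uppercase means spin 1, and the three directive words are read as the regular wavy versions of $(abc)^\omega$ in which exactly one letter (a, b, or c, respectively) has spin 0. If $x\mathbf{r}$ is fixed by the corresponding morphism, then the image of any prefix of $x\mathbf{r}$ must again start with that prefix.

```python
def mu(w, v):
    """Apply mu_w to v; an uppercase letter in w encodes spin 1 (our convention)."""
    for ch in reversed(w):
        x = ch.lower()
        if ch.isupper():    # spin 1: y -> yx for y != x
            v = "".join(c if c == x else c + x for c in v)
        else:               # spin 0: y -> xy for y != x
            v = "".join(c if c == x else x + c for c in v)
    return v

# A prefix of the Tribonacci word r, obtained by iterating mu_abc on a.
r = "a"
for _ in range(4):
    r = mu("abc", r)

results = {}
for x, w in [("a", "aBC"), ("b", "AbC"), ("c", "ABc")]:
    t = x + r[:200]                        # prefix of the word x r
    results[x] = mu(w, t).startswith(t)    # consistent with x r being a fixed point
print(results)  # {'a': True, 'b': True, 'c': True}
```

This is only a necessary condition verified on a prefix of length 201, not a proof of the statement.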

Remark 4.21. Theorem 4.19 corrects an error in [73, Section 5.1] where it was mistakenly said
that if an episturmian word is purely morphic then any shift of it is also purely morphic. Indeed,
this is false even in the Sturmian case as Fagnot [48] has shown that if $\mathbf{s}$ is a purely morphic
standard Sturmian word on {a, b}, then $a\mathbf{s}$, $b\mathbf{s}$, $ab\mathbf{s}$, $ba\mathbf{s}$ (which are purely morphic [17]) are the
only purely morphic Sturmian words related to $\mathbf{s}$ by a shift.

5 Arnoux-Rauzy sequences 

We now briefly turn our attention to Arnoux-Rauzy sequences since their combinatorial properties 
are also considered in the sections that follow. 

Arnoux-Rauzy sequences are uniformly recurrent infinite words over a finite alphabet $\mathcal{A}$ with
factor complexity $(|\mathcal{A}| - 1)n + 1$ for each $n \in \mathbb{N}$, and exactly one right and one left special factor
of each length. They were introduced by Arnoux and Rauzy [97, 12], who studied them using
Rauzy graphs, with particular emphasis on the case $|\mathcal{A}| = 3$. (Note that the foregoing definition
is equivalent to the one given in the introduction.)

As mentioned previously (in Section 2.3), Arnoux-Rauzy sequences are exactly the strict
episturmian words; in particular, any episturmian word has the form $\varphi(\mathbf{t})$ with $\varphi$ an episturmian
morphism and $\mathbf{t}$ an Arnoux-Rauzy sequence. In this sense, episturmian words are only a slight
generalization of Arnoux-Rauzy sequences. For example, the family of episturmian words on
three letters {a, b, c} consists of the Arnoux-Rauzy sequences over {a, b, c}, the Sturmian words
over {a, b}, {b, c}, {a, c} and their images under episturmian morphisms on {a, b, c}, and periodic
infinite words of the form $\varphi(x)^\omega$ where $\varphi$ is an episturmian morphism on {a, b, c} and $x \in \{a, b, c\}$.

Arnoux-Rauzy sequences have deep properties studied in the framework of dynamical systems, 
with connections to geometrical realizations such as Rauzy fractals and interval exchanges. 
When $|\mathcal{A}| = 3$, the condition on the special factors distinguishes Arnoux-Rauzy sequences from
other infinite words of complexity $2n + 1$, such as those obtained by coding trajectories of 3-interval
exchange transformations (e.g., see [51]). In [12], it was shown how Arnoux-Rauzy sequences
of complexity 2n + 1 (i.e., the 3-strict episturmian words) can be geometrically realized by an 
exchange of six intervals on the unit circle, which generalizes the representation of Sturmian 
sequences by rotations. 

An alternative way of introducing and studying Arnoux-Rauzy sequences is in the context of 
S-adic dynamical systems, as done in [105] for instance (see our remarks following Theorem 3.3
in Section 3.2). In [40], Damanik and Zamboni give a kind of survey on this approach by
considering Arnoux-Rauzy subshifts and answering various combinatorial questions concerning linear
recurrence, maximal powers of factors, and the number of palindromes of a given length. They
also present some applications of their results to the spectral theory of discrete one-dimensional
Schrödinger operators with potentials given by Arnoux-Rauzy sequences.

Arnoux-Rauzy sequences also have interesting arithmetical properties. For instance, if one
considers the frequencies of letters (as discussed later in Section 6.4), they are well-defined, and
renormalization by an episturmian morphism leads to a generalization of the continued fraction
algorithm that associates to each $k$-letter Arnoux-Rauzy sequence an infinite array of $k \times k$ rational
numbers. In the special case $k = 2$, these fractions are consecutive Farey numbers arising from
the continued fraction expansion of the frequencies of the two letters. More generally, given an
Arnoux-Rauzy sequence on $k$ letters, its directive word is determined by the 'multi-dimensional'
continued fraction expansion of the frequencies of the first $k - 1$ letters. Unfortunately, this
generalized algorithm (except for the case $k = 2$ when it is exactly the usual continued fraction
algorithm) is only defined on a set of measure zero in $\mathbb{R}^{k-1}$. This reduces its interest and explains
why it has not been appropriately studied since its inception (see Sections 6.2.1 and 6.4 for further
details). Nonetheless, a nice arithmetical characterization of 3-letter Arnoux-Rauzy sequences can
be given, as follows. We say that a triple $(a, b, c)$ does not satisfy the triangular inequality if one
of the coordinates is larger than the sum of the other two (e.g., $a > b + c$). In that case, we
can renormalize in a unique way to obtain the triple $(a - b - c, b, c)$ satisfying the triangular
inequality. The set of allowable frequencies for 3-letter Arnoux-Rauzy sequences is exactly the
set of triples (a, b, c) that can be infinitely renormalized, each time to a triple that does not satisfy 
the triangular inequality (see [E]). The resulting picture exhibits a kind of Sierpinski carpet. 

For further details on Arnoux-Rauzy sequences, we refer the reader to the interesting survey 
[22] in which Berthé, Ferenczi, and Zamboni discuss connections between Arnoux-Rauzy sequences
and rotations of the 2-torus; coding of two-dimensional actions and two-dimensional Sturmian
words; and interval exchanges and sequences of low complexity. See also [35], Section 12.2.3 in
[96], and J. Berstel's nice survey paper [15] in which he compares some combinatorial properties
of Arnoux-Rauzy sequences (as well as episturmian words) to those of Sturmian words. 

5.1 Finite Arnoux-Rauzy words 

A finite word w is said to be finite episturmian if w is a factor of some infinite episturmian 
word. When considering factors of (infinite) episturmian words, it suffices to consider only the 
strict standard ones (i.e., the standard Arnoux-Rauzy sequences). Indeed, for any prefix u of an 
epistandard word, there exists a strict epistandard word also having u as a prefix. In particular,
the words $\mu_w(x)$, with $w \in \mathcal{A}^*$ and $x \in \mathcal{A}$, are the standard ones (cf. standard words, e.g., [83,
Chapter 2]). They can be obtained by the Rauzy rules [98] (see also [33, Theorem 8]), and this
has a strong connection with the set of periods of the palindromes $u_{n+1} = \mathrm{Pal}(x_1 \cdots x_n)$ (given in
Theorem 2.3) and the Euclidean algorithm. This relation was studied by Castelli, Mignosi, and
Restivo [34], who extended the well-known Fine and Wilf Theorem [82] to words having three
periods. Justin [70] generalized this result even further to words having an arbitrary number of
periods, which led to a characterization of finite episturmian words.

Finite episturmian words are exactly the finite Arnoux-Rauzy words. Such words were enu- 
merated by Mignosi and Zamboni [88], who described a multi-dimensional generalization of the 
Euler phi-function that counts the number of finite Arnoux-Rauzy words of each length. Finite 
episturmian words have also been characterized with respect to lexicographic orderings in [62]
(see Theorem 7.5 later).

6 Some properties of factors 

6.1 Factor complexity 

As mentioned previously, any $k$-strict episturmian word has complexity $(k - 1)n + 1$ for all $n \in \mathbb{N}$.
More generally:

Theorem 6.1. [43, Theorem 7] Suppose $\mathbf{t}$ is an episturmian word directed by $\Delta$ with $|\mathrm{Ult}(\Delta)| > 1$.
Then, for $n$ large enough, $\mathbf{t}$ has complexity $(k - 1)n + q$ for some $q \in \mathbb{N}^+$, where $k = |\mathrm{Ult}(\Delta)|$. □

This theorem can be easily deduced from the fact that for sufficiently large $n$, any left special
factor of $\mathbf{t}$ of length at least $n$ has exactly $k = |\mathrm{Ult}(\Delta)|$ different left extensions in $\mathbf{t}$ (by Theorem 6
in [43]).
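
As a numerical illustration of the 3-strict case, one can count the distinct factors of a long prefix of the Tribonacci word and compare with $2n + 1$. The Python sketch below is illustrative only: the helper names are ours, and it assumes that a prefix of length 10609 already contains every factor of length at most 10.

```python
def mu(w, v):
    """Apply mu_w = psi_{x_1} ... psi_{x_n} to the word v (all spins 0)."""
    for x in reversed(w):
        v = "".join(c if c == x else x + c for c in v)
    return v

# Prefix of the Tribonacci word obtained by iterating mu_abc on a.
r = "a"
for _ in range(5):
    r = mu("abc", r)

def complexity(word, n):
    """Number of distinct factors of length n occurring in the finite word."""
    return len({word[i:i + n] for i in range(len(word) - n + 1)})

for n in range(1, 11):
    print(n, complexity(r, n), 2 * n + 1)  # the last two columns agree
```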

6.2 Palindromic factors 

The palindromic complexity of episturmian words was established in [73] by carrying out a similar 
study to the one for Sturmian words in [44] . 

Theorem 6.2. [73, Theorem 4.4] If $\mathbf{t}$ is an $\mathcal{A}$-strict episturmian word, then there exists exactly

• one palindrome of length $n$ for all even $n$,

• one palindrome of length $n$ and centre $x$ for all odd $n$ and $x \in \mathcal{A}$. □

As shown in [33], the above property is characteristic in the Sturmian case, but not when A 
contains more than two letters because it also holds for billiard words, which are not episturmian 
(see Borel and Reutenauer [25]). 
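
Theorem 6.2 can be observed on the Tribonacci word: a long prefix should contain one palindromic factor of each even length and three of each odd length (one per centre a, b, c). The following Python sketch is an illustration with helper names of our own, and it assumes that the prefix used is long enough to contain all factors of length at most 12.

```python
def mu(w, v):
    """Apply mu_w = psi_{x_1} ... psi_{x_n} to the word v (all spins 0)."""
    for x in reversed(w):
        v = "".join(c if c == x else x + c for c in v)
    return v

r = "a"
for _ in range(5):
    r = mu("abc", r)  # prefix of the Tribonacci word

def palindromes(word, n):
    """Set of distinct palindromic factors of length n in the finite word."""
    factors = {word[i:i + n] for i in range(len(word) - n + 1)}
    return {f for f in factors if f == f[::-1]}

for n in range(1, 13):
    pals = palindromes(r, n)
    centres = sorted(p[n // 2] for p in pals) if n % 2 else []
    print(n, len(pals), centres)
```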

Theorem 6.3. [73, Section 4.2] If $\mathbf{t}$ is episturmian, then there exist $|\mathrm{Ult}(\Delta)| + 1$ bi-infinite
episturmian words of the form $\tilde{\mathbf{m}}.\mathbf{m}$ and $\tilde{\mathbf{m}}x\mathbf{m}$ with $x \in \mathrm{Ult}(\Delta)$ giving the palindromic factors of
$\mathbf{t}$. The spinned versions of $\Delta$ directing these bi-infinite episturmian words can be easily constructed
via a simple algorithm. □

For more precise technical details, see Section 4.2 in [73] . 

Example 6.4. For the Tribonacci word, f.r is directed by {abcabcabcabc) U) . 

6.2.1 Iterated palindromic closure



In [105], Risley and Zamboni gave an alternative construction of the sequence $(u_n)_{n \geq 1}$ of
palindromic prefixes of an epistandard word (where $u_1 = \varepsilon$ and $u_{i+1} = \mathrm{Pal}(x_1 \cdots x_i)$ for all $i \geq 1$),
using a 'hat operation' as opposed to palindromic closure. The so-called hat operation is defined
as follows. We construct a new alphabet $\mathcal{A}' := \mathcal{A} \cup \hat{\mathcal{A}}$ where $\hat{\mathcal{A}} = \{\hat{x} \mid x \in \mathcal{A}\}$ and denote by $\phi$
the morphism $\phi : \mathcal{A}' \to \mathcal{A}$ defined by $\phi(\hat{x}) = \phi(x) = x$ for all letters $x \in \mathcal{A}$. The morphism $\phi$
extends to a morphism (also denoted by $\phi$) from words over $\mathcal{A}'$ to words over $\mathcal{A}$. Now, from a
given directive word $\Delta = x_1 x_2 x_3 \cdots \in \mathcal{A}^\omega$, we construct a sequence of words $(p_i)_{i \geq 1}$ as follows.
We begin with $p_1 = \varepsilon$ and $p_2 = \hat{x}_1$. Then, for $n \geq 2$, $p_{n+1}$ is obtained from $p_n$ according to the
rule: if $\hat{x}_n$ does not occur in $p_n$, then $p_{n+1} = p_n \hat{x}_n \phi(p_n)$; otherwise

$$p_{n+1} = p_n \hat{x}_n \phi(s_n),$$

where $s_n$ is the longest suffix of $p_n$ containing no occurrence of $\hat{x}_n$.

Example 6.5. Let $\Delta = (abc)^\omega$. Then using the hat operation, we obtain:

$p_1 = \varepsilon$
$p_2 = \hat{a}$
$p_3 = \hat{a}\hat{b}a$
$p_4 = \hat{a}\hat{b}a\hat{c}aba$
$p_5 = \hat{a}\hat{b}a\hat{c}aba\hat{a}bacaba$
$p_6 = \hat{a}\hat{b}a\hat{c}aba\hat{a}bacaba\hat{b}acabaabacaba$

Now removing all hats (by applying $\phi$), we see that the $p_i$'s are precisely the palindromic prefixes
of the Tribonacci word: abacabaabacababacabaabacaba⋯
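
The agreement between the hat operation and iterated palindromic closure can be tested directly. The Python sketch below is an illustration under assumptions of ours: a hatted word is modelled as a list of `(letter, hatted)` pairs, $s_n$ is taken to be the suffix of $p_n$ following the last occurrence of the hatted letter, and the function names are hypothetical.

```python
def pal_closure(w):
    """Shortest palindrome having w as a prefix."""
    for i in range(len(w)):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w

def Pal(w):
    """Iterated palindromic closure of w."""
    u = ""
    for x in w:
        u = pal_closure(u + x)
    return u

def hat_sequence(directive):
    """Words p_1, p_2, ... of the hat operation, with hats removed at the end."""
    p = []        # p_1 is the empty word
    seq = [""]
    for x in directive:
        hats = [i for i, (c, hatted) in enumerate(p) if hatted and c == x]
        # no hatted x yet: copy all of p_n; otherwise copy what follows the last one
        tail = p if not hats else p[hats[-1] + 1:]
        p = p + [(x, True)] + [(c, False) for c, _ in tail]
        seq.append("".join(c for c, _ in p))
    return seq

print(hat_sequence("abcab")[-1])  # abacabaabacababacabaabacaba
print(hat_sequence("abcab") == [Pal("abcab"[:n]) for n in range(6)])  # True
```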

As demonstrated by the above example, the hat operation is clearly the same as iterated
palindromic closure; in fact, the relationship between these two constructions is evident by
formula (4.1), which we now rewrite as:

$$\mathrm{Pal}(x_1 \cdots x_n) = \mu_{x_1 \cdots x_{n-1}}(x_n)\,\mathrm{Pal}(x_1 \cdots x_{n-1}) \quad \text{for } n > 0.$$

The above formula is actually a special case of formula (3) from [71], which also happens to be
formula (3) in [73], namely:

$$\mathrm{Pal}(vw) = \mu_v(\mathrm{Pal}(w))\,\mathrm{Pal}(v) \quad \text{for any words } w, v. \tag{6.1}$$

This formula is commonly referred to as Justin's Formula, from which we deduce the following
two special cases:

$$\mathrm{Pal}(xw) = \psi_x(\mathrm{Pal}(w))\,x \quad \text{and} \quad \mathrm{Pal}(wx) = \mu_w(x)\,\mathrm{Pal}(w) \quad \text{for any word } w \text{ and letter } x. \tag{6.2}$$

The first formula given in (6.2) tells us that $\mathrm{Pal}(xw)$ is obtained from $\mathrm{Pal}(w)$ simply by inserting
the letter $x$ before each letter different from $x$ and then appending $x$ to the resulting word. For
example, $\mathrm{Pal}(bc) = bcb$ and $\mathrm{Pal}(abc) = abacaba$. The second formula given in (6.2) provides
another way to compute the palindromic right-closure of $wx$ by placing the finite epistandard
word $\mu_w(x)$ in front of $\mathrm{Pal}(w)$. For example, to compute $\mathrm{Pal}(abcb)$ we need only compute the
words $\mu_{abc}(b) = abacab$ and $\mathrm{Pal}(abc) = abacaba$, and then we have:

$$\mathrm{Pal}(abcb) = \mu_{abc}(b)\,\mathrm{Pal}(abc) = abacab \cdot abacaba.$$
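
Justin's Formula lends itself to a brute-force check on short words. The following Python sketch is illustrative only (helper names are ours, all spins are 0) and tests the formula for all pairs of words of length at most 3 over {a, b, c}.

```python
from itertools import product

def mu(w, v):
    """Apply mu_w = psi_{x_1} ... psi_{x_n} to the word v (all spins 0)."""
    for x in reversed(w):
        v = "".join(c if c == x else x + c for c in v)
    return v

def pal_closure(w):
    """Shortest palindrome having w as a prefix."""
    for i in range(len(w)):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w

def Pal(w):
    """Iterated palindromic closure of w."""
    u = ""
    for x in w:
        u = pal_closure(u + x)
    return u

words = ["".join(p) for n in range(4) for p in product("abc", repeat=n)]
justin = all(Pal(v + w) == mu(v, Pal(w)) + Pal(v) for v in words for w in words)
print(justin)                          # True
print(mu("abc", "b") + Pal("abc"))     # abacababacaba
```
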

In [71], Justin established some relations between the words $\mathrm{Pal}(w)$, $\mu_w$, $\mathrm{Pal}(\tilde{w})$, and $\mu_{\tilde{w}}$, where
$w$ is any finite word and $\tilde{w}$ is its reversal. Moreover, he showed that his results can be explained by the similarity of the
incidence matrices of $\mu_w$ and $\mu_{\tilde{w}}$. One curious result is that $|\mathrm{Pal}(w)| = |\mathrm{Pal}(\tilde{w})|$. For example,
with $w = abac$, $\mathrm{Pal}(w) = abaabacabaaba$ and $\mathrm{Pal}(\tilde{w}) = cacbcacacbcac$, both of length 13.
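
The length identity between the iterated palindromic closures of a word and of its reversal can be explored by brute force. The sketch below is an illustration with helper names of our own; it only tests words of length at most 5 over {a, b, c}.

```python
from itertools import product

def pal_closure(w):
    """Shortest palindrome having w as a prefix."""
    for i in range(len(w)):
        if w[i:] == w[i:][::-1]:
            return w + w[:i][::-1]
    return w

def Pal(w):
    """Iterated palindromic closure of w."""
    u = ""
    for x in w:
        u = pal_closure(u + x)
    return u

words = ["".join(p) for n in range(6) for p in product("abc", repeat=n)]
same_length = all(len(Pal(w)) == len(Pal(w[::-1])) for w in words)
print(same_length)  # True
```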

Applying his results to a 2-letter alphabet, Justin [71] gave a new proof of a Galois theorem
on continued fractions, by considering the epistandard words that are fixed points of $\mu_w$ and
$\mu_{\tilde{w}}$ for any finite word $w$. From this point of view, Justin's result highlights the relevance of
the previously mentioned 'multi-dimensional' continued fraction algorithm, proposed by Zamboni
[119, 117] (see also [96, Section 12.2]). However, there still remains much work to be done in
this direction, especially concerning the generalized intercept (coherent with the Sturmian case)
introduced in [73, Section 5.4] and the generalized Ostrowski numeration systems [20, 75] (recall
Remark 4.7).

Note. The aforementioned Galois theorem was used in the theory of Sturmian words to
characterize so-called Sturm numbers (see [83, Theorem 2.3.26]).

6.2.2 Palindromic richness 

In [43], Droubay, Justin, and Pirillo observed that any finite word $w$ contains at most $|w| + 1$
distinct palindromes (including the empty word). Even further, they proved that a word $w$
contains exactly $|w| + 1$ distinct palindromes if and only if the longest palindromic suffix of any
prefix $p$ of $w$ occurs exactly once in $p$ (i.e., every prefix of $w$ has Property Ju [43]). Such words are
'rich' in palindromes in the sense that they contain the maximum number of different palindromic
factors. Accordingly, we say that a finite word $w$ is rich if it contains exactly $|w| + 1$ distinct
palindromes (or equivalently, if every prefix of $w$ has Property Ju). For example, $abac$ is rich since
it is of length 4 and contains the following five palindromes: $\varepsilon$, $a$, $b$, $c$, $aba$. Naturally, an infinite
word is rich if all of its factors are rich. For example, the periodic infinite words $a^\omega = aaa\cdots$ and
$(ab)^\omega = ababab\cdots$ are clearly rich, whereas $(abc)^\omega = abcabcabc\cdots$ is not rich since it contains
the non-rich word $abca$.
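The richness criterion is easy to test by brute force on short words. The sketch below is only an illustration (the names `palindromic_factors` and `is_rich` are ours, and the quadratic enumeration of factors is chosen for clarity rather than efficiency).

```python
def palindromic_factors(w: str) -> set:
    """All distinct palindromic factors of w, including the empty word."""
    return {w[i:j]
            for i in range(len(w) + 1)
            for j in range(i, len(w) + 1)
            if w[i:j] == w[i:j][::-1]}


def is_rich(w: str) -> bool:
    """A finite word is rich if it has exactly |w| + 1 distinct palindromic factors."""
    return len(palindromic_factors(w)) == len(w) + 1


print(sorted(palindromic_factors("abac")))  # ['', 'a', 'aba', 'b', 'c']
print(is_rich("abac"), is_rich("abca"))     # True False
```

Since an infinite word is rich when all of its factors are, a single non-rich factor such as `abca` is enough to rule out $(abc)^\omega$.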

Droubay et al. [43] showed that all finite and infinite episturmian words are rich. Specifically,
they proved that if an infinite word has property Pi (and hence is epistandard - see Theorem
2.3), then all of its prefixes have property Ju. Consequently, any factor $u$ of an epistandard word
(and hence, of an episturmian word) contains exactly $|u| + 1$ distinct palindromes, and is therefore
rich (see Corollary 2 in [43]).

Another special class of rich words that encompasses the episturmian words consists of Fischler's
sequences with "abundant palindromic prefixes". These words were introduced and studied in
[54, 55] in the context of Diophantine approximation. See also papers by Adamczewski and Bugeaud
[2, 3] concerning the transcendence of certain real numbers whose sequences of partial quotients
contain arbitrarily long palindromes.

The theory of rich words has recently been further developed in a series of papers [61, 29, 42,
30]. In independent work, Ambrož, Frougny, Masáková, and Pelantová [8] have considered the
same class of words which they call full words, following the earlier work of Brlek, Hamel, Nivat,
and Reutenauer in [26].

6.3 Fractional powers & critical exponent 

The study of fractional powers occurring in Sturmian words has been a topic of growing interest 
in recent times. See for instance [14, 23, 39, 72, 86, 111], as well as [73, 105, 59] for similar results
concerning episturmian words and Arnoux-Rauzy sequences.
The following theorem extends the results in [72] on fractional powers in Sturmian words.
Throughout this section, we let $s$ denote an epistandard word with directive word $\Delta = x_1x_2 \cdots \in
\mathcal{A}^\omega$ (as usual), and for all $n \geq 1$, we denote by $u_{n+1}$ the palindromic prefix $Pal(x_1 \cdots x_n)$ of $s$ given
in Theorem 2.3. As in [72], we denote by $L(m)$ the length of the longest factor $v \in F(s)$ having
period $m \in \mathbb{N}$, and write $L(m) = em + r$, $e \in \mathbb{N}^+$, $0 \leq r < m$. Given a finite or infinite word $w$,
we denote by $w(i)$ (resp. $w(i,j)$) the letter in position $i$ of $w$ (resp. the factor of $w$ beginning at
position $i$ and ending at position $j$).

When $L(m) \geq 2m$, all factors of $s$ having period $m$ and length $L(m)$ are equal to a palindrome
$v$, and for $0 \leq i < e$, the word $v(1, im + r)$ is a palindromic prefix of $s$ by Lemma 4.1 in [73].
Moreover, with the preceding notation, we have:

Theorem 6.6. [73, Theorem 4.2] Let $m, n \in \mathbb{N}$ be such that $|u_n| < m \leq |u_{n+1}|$ and $s(1,m) = w$
is primitive with $s(m) = x$ occurring in $s(1, m-1)$. Then the following properties hold.

i) $L(m) \geq 2m$ if and only if $w = \mu_{x_1 \cdots x_n}(x)$ and $x \in Alph(x_{n+1}x_{n+2} \cdots)$.

ii) Suppose $L(m) \geq 2m$ and define $p = \max\{i \leq n \mid x_i = x\}$ and $t = \min\{j \in \mathbb{N}^+ \mid x_{n+j} \neq
x\}$. Then $u_{n+t} = w^t u_p$ is the longest prefix of $s$ having period $m$. Moreover, if $x \in
Alph(x_{n+t+1}x_{n+t+2} \cdots)$, then $e = t + 1$; that is, $v = w^{t+1}u_p$, otherwise $e = t$ and $v =
w^t u_p$. □

Remark 6.7. Let us mention a few noteworthy facts. 

• Exponents of powers in s are bounded if and only if exponents of letters in A are bounded [105 , 

Eg. 

• Any Sturmian word has square prefixes and so do epistandard words [U 1105] . 

• Any episturmian word has infinitely many prefixes of the form $uv^2$ with $|u|/|v|$ bounded
above.

The latter fact is readily deduced from the following result of Risley and Zamboni [105] . 

Theorem 6.8. [105, Prop. 1.3] If $t$ is an Arnoux-Rauzy sequence, then there exists a positive
number $\epsilon$ such that $t$ begins with infinitely many blocks of the form $UVVV'$, where $V'$ is a prefix
of $V$ and $\min\{|V'|/|V|, |V|/|U|\} \geq \epsilon$. □

Note. Such a result is motivated by transcendence issues; see for instance [52] . 

When $s$ is purely morphic, it is possible to give a rather explicit formula for the critical
exponent $\gamma = \limsup_{m \to \infty} L(m)/m$, as follows.

Notation. Let $P$ be the function defined by $P(n) = \sup\{i < n \mid x_i = x_n\}$ if this integer exists,
undefined otherwise. That is, if $x_n = a$, then $P(n)$ is the position of the right-most occurrence of
the letter $a$ in the prefix $x_1x_2 \cdots x_{n-1}$ of the directive word $\Delta = x_1x_2x_3 \cdots \in \mathcal{A}^\omega$.

Theorem 6.9. [73, Theorem 5.2] Let $s$ be an $\mathcal{A}$-strict epistandard word generated by a morphism
with directive word $\Delta$ having period $q$. Further, let $l \in \mathbb{N}$ be maximal such that $y^l \in F(\Delta)$ for some
letter $y$, and define $\mathcal{L} = \{r,\ 0 \leq r < q \mid x_{r+1} = x_{r+2} = \cdots = x_{r+l}\}$ and $d(r) = r + q + 1 - P(r + q + 1)$
for $0 \leq r < q$. Then the critical exponent for $s$ is given by

$$\gamma = l + 2 + \sup_{r \in \mathcal{L}} \left\{ \lim_{i \to \infty} |u_{r+iq+1-d(r)}| / |h_{r+iq}| \right\}.$$

Moreover, for any letter $x$ in $s$ the limit above can be obtained as a rational function with rational
coefficients of the frequency $\alpha_x$ of this letter. □

See also [86, 107, 111] for results on the critical exponent for the Fibonacci word, Tribonacci
word, and Sturmian words, respectively.

Example 6.10. For the ever-so popular Fibonacci word $f$, directed by $(ab)^\omega$, we have $q = 2$,
$l = 1$, $d(0) = d(1) = 2$. Hence, since $|u_{n-1}|/|h_n|$ has limit $1/\varphi$ where $\varphi = (1 + \sqrt{5})/2$ is the golden
ratio, we obtain the well-known value $2 + \varphi$ for the critical exponent, originally proved by Mignosi
and Pirillo [86].

More generally, the $k$-bonacci word, directed by $(a_1a_2 \cdots a_k)^\omega$, has critical exponent $2 + 1/(\varphi_k -
1)$, where the $k$-bonacci constant $\varphi_k$ is the (unique) positive real root of the $k$-th degree monic
polynomial $x^k - x^{k-1} - \cdots - x - 1$.
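The closed form $2 + 1/(\varphi_k - 1)$ can be evaluated numerically. The sketch below is a plain bisection on the interval $[1, 2]$, where the polynomial changes sign for every $k \geq 2$; the function names and the tolerance are illustrative choices, and nothing here is taken from [73].

```python
def k_bonacci_constant(k: int, tol: float = 1e-12) -> float:
    """Positive real root of x^k - x^(k-1) - ... - x - 1, for k >= 2, by bisection."""
    def p(x: float) -> float:
        return x ** k - sum(x ** i for i in range(k))

    lo, hi = 1.0, 2.0          # p(1) = 1 - k < 0 and p(2) = 1 > 0
    while hi - lo > tol:
        mid = (lo + hi) / 2
        if p(mid) < 0:
            lo = mid
        else:
            hi = mid
    return (lo + hi) / 2


def k_bonacci_critical_exponent(k: int) -> float:
    """Value of 2 + 1/(phi_k - 1) from the formula stated above."""
    return 2 + 1 / (k_bonacci_constant(k) - 1)


for k in (2, 3, 4):
    print(k, k_bonacci_constant(k), k_bonacci_critical_exponent(k))
```

For $k = 2$ the constant is the golden ratio, and $2 + 1/(\varphi - 1) = 2 + \varphi$ recovers the value of Example 6.10.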

6.4 Frequencies 

Let $w$ be a non-empty finite word. For any $v \in F(w)$, the frequency of $v$ in $w$ is $|w|_v/|w|$ where $|w|_v$
denotes the number of distinct occurrences of $v$ in $w$. The notion of frequency can be extended
to infinite words in two ways, as follows.

Definition 6.11. Suppose $v$ is a non-empty factor of an infinite word $x$. Then:

i) the frequency of $v$ in $x$ in the weak sense is $\lim_{n \to \infty} |x(1,n)|_v / n$ if this limit exists;

ii) $v$ has frequency $\alpha_v$ in $x$ in the strong sense if for any sequence $(w_n)_{n \geq 0}$ of factors of $x$ with
increasing lengths, we have $\alpha_v = \lim_{n \to \infty} |w_n|_v / |w_n|$.

In a purely combinatorial way, Justin and Pirillo [73, Section 6] proved that any factor
occurring in an episturmian word has frequency in the strong sense.
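As a numerical illustration of the weak-sense frequency, the sketch below counts occurrences of a factor in a long prefix of the Fibonacci word. It is an experiment, not a proof: the prefix length is arbitrary, the helper names are ours, and the Fibonacci word is generated by iterating the morphism $a \mapsto ab$, $b \mapsto a$. For this word the letter frequencies are known to be $1/\varphi$ and $1/\varphi^2$.

```python
from math import sqrt


def fibonacci_prefix(n: int) -> str:
    """Prefix of length n of the Fibonacci word, via the morphism a -> ab, b -> a."""
    w = "a"
    while len(w) < n:
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w[:n]


def count_occurrences(w: str, v: str) -> int:
    """Number of (possibly overlapping) occurrences of v in w."""
    return sum(1 for i in range(len(w) - len(v) + 1) if w.startswith(v, i))


def prefix_frequency(x: str, v: str, n: int) -> float:
    """The ratio |x(1,n)|_v / n from Definition 6.11 i), for a finite n."""
    return count_occurrences(x[:n], v) / n


phi = (1 + sqrt(5)) / 2
x = fibonacci_prefix(20000)
for v in ("a", "b", "aa"):
    print(v, prefix_frequency(x, v, len(x)))
print(1 / phi, 1 / phi ** 2, 1 / phi ** 3)
```

The value for `aa` follows from the other two: every `b` is preceded by an `a`, so the frequency of `aa` is $1/\varphi - 1/\varphi^2 = 1/\varphi^3$.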

Woźny and Zamboni [117] also studied frequencies (in the weak sense) for Arnoux-Rauzy
sequences. Using a reformulation of a vectorial division algorithm, originally introduced in [105],
they computed each allowable frequency of factors of the same length, as well as the number of
factors with a given frequency. In particular, the authors of [117] gave simultaneous rational
approximations of the frequencies by unreduced fractions having a common denominator. From this
work, one recovers the results of Berthé [19] for Sturmian words in terms of Farey approximations
arising from the continued fraction expansions of the frequencies of the letters. For instance, the
frequencies of factors of the same length in a Sturmian word assume at most three values, which
were explicitly given by Berthé [19], who also discovered that this result is in strong connection
with the three distance theorem in Diophantine analysis.

6.5 Return words 

Let us now recall the notion of a return word, which was introduced independently by Durand 
[45], and Holton and Zamboni [67] when studying primitive substitutive sequences. 

Definition 6.12. Let $v$ be a recurrent factor of $y \in \mathcal{A}^\omega$, starting at positions $n_1 < n_2 < n_3 < \cdots$.
Then each word $r_i = y_{n_i}y_{n_i+1} \cdots y_{n_{i+1}-1}$ is called a return to $v$ in $y$. Moreover, $y$ can be factorized
in a unique way as $y = y_1 \cdots y_{n_1-1}r_1r_2r_3 \cdots$ where $r_1r_2r_3 \cdots$, viewed as a word on the $r_i$, is called
the derived word of $y$ with respect to $v$.

That is, a return to v in y is a non-empty factor of y beginning at an occurrence of v and 
ending exactly before the next occurrence of v in y. Thus, if r is a return to v in y, then rv is 
a factor of y that contains exactly two occurrences of v, one as a prefix and one as a suffix. We 
call rv a complete return to v [76]. 
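Returns can be read off directly from the list of occurrences of $v$. The following sketch works on a finite prefix of the Fibonacci word, so it only sees the returns between occurrences that fall inside that prefix; the helper names and the prefix length are illustrative assumptions.

```python
def fibonacci_prefix(n: int) -> str:
    """Prefix of length n of the Fibonacci word, via the morphism a -> ab, b -> a."""
    w = "a"
    while len(w) < n:
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w[:n]


def return_words(y: str, v: str) -> set:
    """Returns to v seen in the finite word y: the factors between consecutive occurrences."""
    positions = [i for i in range(len(y) - len(v) + 1) if y.startswith(v, i)]
    return {y[p:q] for p, q in zip(positions, positions[1:])}


def complete_returns(y: str, v: str) -> set:
    """Each complete return is a return r followed by v."""
    return {r + v for r in return_words(y, v)}


y = fibonacci_prefix(2000)
print(sorted(return_words(y, "a")))      # ['a', 'ab']
print(sorted(complete_returns(y, "a")))  # ['aa', 'aba']
```

On this prefix every short factor has exactly two returns, in line with the Sturmian case of the results discussed in this section.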
Return words play an important role in the study of minimal subshifts in symbolic dynamics;
see for instance [45, 46, 47, 53, 106]. In the context of episturmian words, such words have recently
proven to be a useful tool in the study of quasiperiodicity (see Section 8 for further details). This
latest work made use of the following result of Justin and Vuillon [76] which completely describes 
the returns to any factor of an epistandard word. In fact, their result actually characterizes 
return words in episturmian words (not just epistandard words) since, by uniform recurrence, the 
returns to any factor v in an epistandard word s are the same as the returns to v as a factor of 
any episturmian word t having the same set of factors as s. 

Theorem 6.13. [76] Let $s$ be an epistandard word directed by $\Delta = x_1x_2x_3 \cdots \in \mathcal{A}^\omega$ and consider
any $v \in F(s)$. If $u_{n+1}$ is the shortest palindromic prefix of $s$ containing $v$ with $u_{n+1} = fvg$, then
the returns to $v$ in $s$ are the words $f^{-1}\mu_{x_1 \cdots x_n}(x)f$ where $x \in Alph(x_{n+1}x_{n+2} \cdots)$. Moreover, the
corresponding complete returns to $v$ are the words $f^{-1}(u_{n+1}x)^{(+)}g^{-1}$ and the derived word of $s$
with respect to $v$ is given by $\mu_{x_1 \cdots x_n}^{-1}(s)$. □

Note. It follows immediately that any factor of an $\mathcal{A}$-strict episturmian word has exactly $|\mathcal{A}|$
return words.

Theorem 6.13 extends earlier work of Vuillon on return words in Sturmian words (see [114]).
In particular, Vuillon proved that Sturmian words are characterized by the property that any
non-empty factor has exactly 2 different return words in the given Sturmian word. However,
contrary to what one might expect, such a property with 2 replaced by a positive integer $k \geq 3$
does not characterize $k$-strict episturmian words. For instance, infinite words coding 3-interval
exchange transformations, which constitute a different generalization of Sturmian words to 3- 
letter alphabets, are known to have the property that every factor has 3 different return words 
(see the work by Ferenczi, Holton, and Zamboni in [51]). 

7 Balance & lexicographic order

7.1 q-Balance

Definition 7.1. A finite or infinite word is q-balanced if, for any two of its factors $u$, $v$ with
$|u| = |v|$, we have

$$\big| |u|_x - |v|_x \big| \leq q \quad \text{for any letter } x,$$

i.e., the number of $x$'s in each of $u$ and $v$ differs by at most $q$.

Note. A 1-balanced word is simply said to be balanced. 
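Definition 7.1 translates directly into a brute-force test on finite words. The sketch below computes the smallest $q$ for which a finite word is $q$-balanced, using prefix sums per letter; the names `imbalance`, `is_q_balanced`, and `morphic_prefix` are illustrative, and the two test words are prefixes generated by the usual Fibonacci and Tribonacci morphisms.

```python
def morphic_prefix(rules: dict, n: int, seed: str = "a") -> str:
    """Prefix of length n of the fixed point of a morphism whose image of `seed` starts with `seed`."""
    w = seed
    while len(w) < n:
        w = "".join(rules[c] for c in w)
    return w[:n]


def imbalance(w: str) -> int:
    """Smallest q such that the finite word w is q-balanced."""
    worst = 0
    for x in set(w):
        prefix = [0]
        for c in w:
            prefix.append(prefix[-1] + (c == x))
        for n in range(1, len(w) + 1):
            # Number of x's in each factor of length n, via prefix sums.
            counts = [prefix[i + n] - prefix[i] for i in range(len(w) - n + 1)]
            worst = max(worst, max(counts) - min(counts))
    return worst


def is_q_balanced(w: str, q: int) -> bool:
    return imbalance(w) <= q


FIBONACCI = {"a": "ab", "b": "a"}
TRIBONACCI = {"a": "ab", "b": "ac", "c": "a"}
print(imbalance(morphic_prefix(FIBONACCI, 300)))   # 1
print(imbalance(morphic_prefix(TRIBONACCI, 300)))  # 2
```

For the Tribonacci prefix the value 2 is already reached by the factors `aca` and `bab`, which contain zero and two occurrences of `b`, respectively.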

The term 'balanced' is relatively new; it appeared in [16, 17] (also see [83, Chapter 2]), and the
notion itself dates back to [91, 37]. In the pioneering work of Morse and Hedlund [91], balanced
infinite words over a 2-letter alphabet were called 'Sturmian trajectories' and belong to three
classes: aperiodic Sturmian; periodic Sturmian; and infinite words that are ultimately periodic
(but not periodic), called skew words. That is, the family of balanced infinite words consists of
the (recurrent) Sturmian words and the (non-recurrent) skew infinite words, the factors of which
are balanced. Skew words are ultimately periodic suffixes of words of the form $\mu(a^pba^\omega)$, where
$\mu$ is a pure standard Sturmian morphism and $p \in \mathbb{N}$. For example, $aba^\omega$ and $\psi_b(aba^\omega) = bab(ba)^\omega$
are skew. See also [108, 109, 66, 95] for further work on skew words.

Remark 7.2. Nowadays, for most authors, only the aperiodic Sturmian words are considered to 
be 'Sturmian'. However, from now on, we will use the term 'Sturmian' to refer to both aperiodic 
and periodic Sturmian words. In the context of cutting sequences, the aperiodic (resp. periodic)
Sturmian words are precisely those with irrational slope (resp. rational slope).

It is important to note that a finite word is finite Sturmian (i.e., a factor of some Sturmian
word) if and only if it is balanced [83, Chapter 2]. Accordingly, the balanced infinite words
are precisely the infinite words whose factors are finite Sturmian. This concept was recently 
generalized in [62] by showing that the set of all infinite words whose factors are finite episturmian 
consists of the (recurrent) episturmian words and the (non-recurrent) episkew infinite words, as 
defined in the next section. 

7.2 Episkew words 

Inspired by the skew words of Morse and Hedlund [91], episkew words were recently defined in 
[62] as non-recurrent infinite words, all of whose factors are (finite) episturmian. The following
theorem gives a number of equivalent definitions of such words, similar to those for (recurrent) 
episturmian words. 

Theorem 7.3. [62] An infinite word $t$ with $Alph(t) = \mathcal{A}$ is episkew if equivalently:

i) $t$ is non-recurrent and all of its factors are (finite) episturmian;

ii) there exists an infinite sequence $(t^{(i)})_{i \geq 0}$ of non-recurrent infinite words and a directive word
$x_1x_2x_3 \cdots$ ($x_i \in \mathcal{A}$) such that $t^{(0)} = t, \ldots, {t'}^{(i-1)} = \psi_{x_i}(t^{(i)})$, where ${t'}^{(i-1)} = t^{(i-1)}$ if $t^{(i-1)}$
begins with $x_i$ and ${t'}^{(i-1)} = x_i t^{(i-1)}$ otherwise;

iii) there exists a letter $x \in \mathcal{A}$ and an epistandard word $s$ on $\mathcal{A} \setminus \{x\}$ such that $t = v\mu(s)$, where
$\mu$ is a pure epistandard morphism on $\mathcal{A}$ and $v$ is a non-empty suffix of $\mu(s_p x)$ for some
$p \in \mathbb{N}$.

Moreover, $t$ is said to be strict episkew if $s$ is strict on $\mathcal{A} \setminus \{x\}$, i.e., if each letter in $\mathcal{A} \setminus \{x\}$
occurs infinitely often in the directive word $x_1x_2x_3 \cdots$. □

A simple example of an episkew word on more than two letters is the infinite word $cf =
cabaababa \cdots$ where $f$ is the Fibonacci word and $c$ is a letter (see also Example 3.7).

Note that the episkew words on a 2-letter alphabet are precisely the skew words. Certainly,
in the Sturmian case, the word $s_p x s$ reduces to a word of the form $a^pba^\omega$.

Remark 7.4. Thanks to Richomme [104], episkew words actually have the following simpler
characterization: an infinite word $t$ is episkew if and only if $t = \varphi(xs)$ where $s$ is an epistandard
word, $x$ is a letter not occurring in $s$, and $\varphi$ is a pure episturmian morphism.

Episkew words were first alluded to (but not explicated) in the recent paper [60]. Following 
that paper, these words showed up again in the study of inequalities characterizing finite and 
infinite episturmian words with respect to lexicographic orderings [62]. In fact, as detailed in the 
next section, episturmian words have similar extremal properties to Sturmian words. See also 
p3J [691 E31 [Ml E3 EH [62] for other work in this direction. 

7.3 Extremal properties 

Suppose the alphabet $\mathcal{A}$ is totally ordered by the relation $<$. Then we can totally order $\mathcal{A}^*$ by
the lexicographic order $\leq$ defined as follows. Given two words $u, v \in \mathcal{A}^+$, we have $u \leq v$ if and
only if either $u$ is a prefix of $v$ or $u = xau'$ and $v = xbv'$, for some $x, u', v' \in \mathcal{A}^*$ and letters $a$, $b$
with $a < b$. This is the usual alphabetic ordering in a dictionary. We write $u < v$ when $u \leq v$ and
$u \neq v$, in which case we say that $u$ is (strictly) lexicographically smaller than $v$. The notion of
lexicographic order naturally extends to infinite words in $\mathcal{A}^\omega$. We denote by $\min(\mathcal{A})$ the smallest
letter in $\mathcal{A}$ with respect to the given lexicographic order.

Let $w$ be a finite or infinite word over $\mathcal{A}$ and let $k$ be a positive integer. We denote by
$\min(w|k)$ (resp. $\max(w|k)$) the lexicographically smallest (resp. greatest) factor of $w$ of length $k$
for the given order (where $|w| \geq k$ if $w$ is finite). If $w$ is infinite, then it is clear that $\min(w|k)$
and $\max(w|k)$ are prefixes of the respective words $\min(w|k + 1)$ and $\max(w|k + 1)$. So we can
define, by taking limits, the following two infinite words (see [94]):

$$\min(w) = \lim_{k \to \infty} \min(w|k) \quad \text{and} \quad \max(w) = \lim_{k \to \infty} \max(w|k).$$

That is, to any infinite word t we can associate two infinite words min(t) and max(t) such that 
any prefix of min(t) (resp. max(t)) is the lexicographically smallest (resp. greatest) amongst the 
factors of t of the same length. 

For a finite word $w$ over $\mathcal{A}$ and a given order on $\mathcal{A}$, $\min(w)$ denotes $\min(w|k)$ where $k$ is
maximal such that all $\min(w|j)$, $j = 1, 2, \ldots, k$, are prefixes of $\min(w|k)$. In the case $\mathcal{A} = \{a, b\}$,
$\max(w)$ is defined similarly (see [62]).

In 2003, Pirillo [93] (see also [93]) proved that, for infinite words s on a 2-letter alphabet {a, b} 
with a < b, the inequality 

$$as \leq \min(s) \leq \max(s) \leq bs \qquad (7.1)$$

characterizes standard Sturmian words (aperiodic and periodic). Actually, this result was known
much earlier, dating back to the work of P. Veerman [112, 113] in the mid 80's. Since that time,
these 'Sturmian inequalities' have been rediscovered numerous times under different guises, as
discussed in the forthcoming survey paper [6].

Continuing his work in relation to inequality (7.1), Pirillo [93] proved further that, in the case
of an arbitrary finite alphabet $\mathcal{A}$, an infinite word $s \in \mathcal{A}^\omega$ is epistandard if and only if, for any
lexicographic order, we have

$$as \leq \min(s) \quad \text{where } a = \min(\mathcal{A}). \qquad (7.2)$$

Moreover, $s$ is a strict epistandard word if and only if (7.2) holds with strict equality for any
order [74].

In a similar spirit, Glen, Justin, and Pirillo [62] recently established new characterizations of
finite Sturmian and episturmian words via lexicographic orderings. As a consequence, they were 
able to characterize by lexicographic order all episturmian and episkew words. Similarly, they 
characterized by lexicographic order all balanced infinite words on a 2-letter alphabet; in other 
words, all Sturmian and skew infinite words, the factors of which are (finite) Sturmian. In the 
finite case: 

Theorem 7.5. A finite word $w$ on $\mathcal{A}$ is episturmian if and only if there exists a finite word
$u$ such that, for any lexicographic order,

$$au_{|m|-1} \leq m \qquad (7.3)$$

where $m = \min(w)$ and $a = \min(\mathcal{A})$ for the considered order. □
Example 7.6. Consider the finite word $w = baabacababac$. For the different orders on $\{a,b,c\}$,
we have

• $a < b < c$ or $a < c < b$: $\min(w) = aabacababac$,

• $b < a < c$ or $b < c < a$: $\min(w) = babac$,

• $c < a < b$ or $c < b < a$: $\min(w) = cababac$.

It can be verified that a finite word $u$ satisfying (7.3) must begin with aba and one possibility is
u = abacaaaaaa; thus w is a finite episturmian word. 

Note. In the above example, any two orders with the same minimum letter give the same mm(w), 
which is not true in general. 
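The computations in Example 7.6 can be reproduced with a few lines of Python. This is a sketch under two stated assumptions: an order is given as a sequence of letters from smallest to greatest, and $u_{|m|-1}$ in (7.3) is read as the prefix of $u$ of length $|m| - 1$. The function names are illustrative.

```python
from itertools import permutations


def rank_key(order):
    """Sort key realizing the lexicographic order induced by `order` (smallest letter first)."""
    rank = {c: i for i, c in enumerate(order)}
    return lambda word: [rank[c] for c in word]


def min_factor(w, k, order):
    """min(w|k): the smallest factor of w of length k for the given order."""
    return min((w[i:i + k] for i in range(len(w) - k + 1)), key=rank_key(order))


def finite_min(w, order):
    """min(w) for a finite word: follow min(w|1), min(w|2), ... while each extends the previous one."""
    m = ""
    for k in range(1, len(w) + 1):
        candidate = min_factor(w, k, order)
        if not candidate.startswith(m):
            break
        m = candidate
    return m


def satisfies_7_3(w, u, order):
    """Check a u_{|m|-1} <= m, with u_{|m|-1} assumed to be the prefix of u of length |m| - 1."""
    m = finite_min(w, order)
    key = rank_key(order)
    return key(order[0] + u[:len(m) - 1]) <= key(m)


w, u = "baabacababac", "abacaaaaaa"
for order in permutations("abc"):
    print("<".join(order), finite_min(w, order), satisfies_7_3(w, u, order))
```

Each of the six orders yields the value of $\min(w)$ listed in the example, and the inequality holds for all of them with $u = abacaaaaaa$.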

A corollary of Theorem 7.5 is the following new characterization of finite Sturmian words (i.e.,
finite balanced words).

Corollary 7.7. [62] A finite word $w$ on $\mathcal{A} = \{a,b\}$, $a < b$, is not Sturmian (in other words, not
balanced) if and only if there exists a finite word $u$ such that $aua$ is a prefix of $\min(w)$ and $bub$ is
a prefix of $\max(w)$. □

In the infinite case, the following characterization of all infinite words whose factors are finite
episturmian follows almost immediately from Theorem 7.5.

Corollary 7.8. [62] An infinite word $t$ on $\mathcal{A}$ is episturmian or episkew if and only if there exists
an infinite word $u$ such that, for any lexicographic order,

$$au \leq \min(t) \quad \text{where } a = \min(\mathcal{A}).$$

□

Consequently, an infinite word $s$ on $\{a, b\}$ ($a < b$) is balanced (i.e., Sturmian or skew) if and
only if there exists an infinite word $u$ such that

$$au \leq \min(s) \leq \max(s) \leq bu.$$

Corollary 7.8 was recently refined in [58] where it was shown that, for any aperiodic episturmian
word $t$, the infinite word $u$ (as given in the corollary) is the unique epistandard word with the
same set of factors as $t$. As an easy consequence, we obtain the following characterization of strict
episturmian words that are infinite Lyndon words (Theorem 7.9). Recall that a non-empty finite
word $w$ over $\mathcal{A}$ is a Lyndon word if it is lexicographically smaller than all of its proper suffixes
for the given order $<$ on $\mathcal{A}$. Equivalently, $w$ is the lexicographically smallest primitive word in its
conjugacy class; that is, $w < vu$ for all non-empty words $u$, $v$ such that $w = uv$. The first of these
definitions extends to infinite words: an infinite word over $\mathcal{A}$ is an infinite Lyndon word if and
only if it is (strictly) lexicographically smaller than all of its proper suffixes for the given order
on $\mathcal{A}$. That is, a finite or infinite word $w$ is a Lyndon word if and only if $w < T^i(w)$ for all $i > 0$.

Assuming that $|\mathcal{A}| > 1$ (since there are no Lyndon words on a 1-letter alphabet), we have:

Theorem 7.9. [58] An $\mathcal{A}$-strict episturmian word $t$ is an infinite Lyndon word if and only if
$t = as$ where $a = \min(\mathcal{A})$ for the given order on $\mathcal{A}$ and $s$ is an (aperiodic) $\mathcal{A}$-strict epistandard
word. Moreover, if $\Delta = x_1x_2 \cdots \in \mathcal{A}^\omega$ is the directive word of $s$, then $t = as$ is the unique
episturmian word in the subshift of $s$ directed by the spinned version of $\Delta$ having all spins 1,
except when $x_i = a$. □
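The Lyndon condition can be tested exactly on finite words and only partially on infinite ones. The sketch below does both for the Fibonacci word $f$: `is_lyndon` implements the finite definition, while `smaller_than_shifts` compares a window of the word with the same-length window of its first few shifts. The window size, the number of shifts, and the function names are illustrative assumptions, and a finite check of this kind is evidence rather than proof.

```python
def rank_key(order):
    """Sort key realizing the lexicographic order induced by `order` (smallest letter first)."""
    rank = {c: i for i, c in enumerate(order)}
    return lambda word: [rank[c] for c in word]


def is_lyndon(w: str, order: str) -> bool:
    """Finite Lyndon word: strictly smaller than each of its proper suffixes."""
    key = rank_key(order)
    return len(w) > 0 and all(key(w) < key(w[i:]) for i in range(1, len(w)))


def fibonacci_prefix(n: int) -> str:
    """Prefix of length n of the Fibonacci word, via the morphism a -> ab, b -> a."""
    w = "a"
    while len(w) < n:
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w[:n]


def smaller_than_shifts(t: str, order: str, shifts: int, window: int) -> bool:
    """True if t[:window] is strictly smaller than t[i:i+window] for 1 <= i <= shifts.

    Requires len(t) >= shifts + window so that all windows have equal length.
    """
    key = rank_key(order)
    head = key(t[:window])
    return all(head < key(t[i:i + window]) for i in range(1, shifts + 1))


f = fibonacci_prefix(3000)
print(is_lyndon("aab", "ab"), is_lyndon("aba", "ab"))  # True False
print(smaller_than_shifts("a" + f, "ab", 50, 1000))    # True
print(smaller_than_shifts("b" + f, "ba", 50, 1000))    # True
print(smaller_than_shifts(f, "ab", 50, 1000))          # False
```

The outcome matches Theorem 7.9 for the two orders on $\{a, b\}$: $af$ behaves like a Lyndon word when $a < b$ and $bf$ does when $b < a$, whereas $f$ itself is larger than its shift starting with `aab`.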
The above theorem is actually a generalization of a result on (aperiodic) Sturmian words given
by Borel and Laubie [21] (see also [102]).

Let $\mathcal{A} = \{a_1, \ldots, a_m\}$ be an alphabet ordered by $a_1 < a_2 < \cdots < a_m$. Then Theorem 7.9 says
that an $\mathcal{A}$-strict episturmian word $t$ is an infinite Lyndon word if and only if the (normalized)
directive word of $t$ belongs to $\{a_1, \bar{a}_2, \ldots, \bar{a}_m\}^\omega$. This can be reformulated as a generalization of
Proposition 6.4 in [81]:

Corollary 7.10. [58] An $\mathcal{A}$-strict episturmian word $t$ is an infinite Lyndon word if and only if it
can be infinitely decomposed over the set of morphisms $\{\psi_a, \bar{\psi}_x \mid x \in \mathcal{A} \setminus \{a\}\}$ where $a = \min(\mathcal{A})$
for the given order on $\mathcal{A}$. □

We observe that, although there exist $|\mathcal{A}|!$ possible orders of a finite alphabet
$\mathcal{A}$, Theorem 7.9 shows that there exist exactly $|\mathcal{A}|$ infinite Lyndon words in the subshift of a given
$\mathcal{A}$-strict epistandard word $s$ (when $|\mathcal{A}| > 1$). That is, for any order with $\min(\mathcal{A}) = a$, the subshift
of $s$ contains a unique infinite Lyndon word beginning with $a$, namely $as$.

Example 7.11. With $\Delta = (abc)^\omega$, the spinned versions $(a\bar{b}\bar{c})^\omega$, $(\bar{a}b\bar{c})^\omega$, $(\bar{a}\bar{b}c)^\omega$ and their 'opposites'
(obtained by exchange of spins): $(\bar{a}bc)^\omega$, $(a\bar{b}c)^\omega$, $(ab\bar{c})^\omega$ direct episturmian words in the subshift
of the Tribonacci word $r$. Only the first three of these spinned infinite words direct episturmian
Lyndon words: $ar$, $br$, $cr$, respectively.

The above results on strict episturmian Lyndon words have very recently been generalized to
all episturmian words by Glen, Levé, and Richomme [64], as follows.

Theorem 7.12. [64] Let $\mathcal{A} = \{a_1, \ldots, a_m\}$ be an alphabet ordered by $a_1 < a_2 < \cdots < a_m$ and,
for $1 \leq i \leq m$, let $B_i = \{a_i, \ldots, a_m\}$. An episturmian word $t$ is an infinite Lyndon word if and
only if there exists an integer $j$ such that $1 \leq j < m$ and the (normalized) directive word of $t$
belongs to:

$$(\bar{B}_2^* a_1)^* \cdots (\bar{B}_j^* a_{j-1})^* (\bar{B}_{j+1}^* a_j)^* (\bar{B}_{j+1}^+ \{a_j\}^+)^\omega.$$

□

Note. In the above theorem, the word normalized appears between brackets since one can easily
verify from Theorem 4.13 that a spinned infinite word of the given form is the unique directive
word of exactly one episturmian word.

Example 7.13. [64] Let A = {a, b, c, d}. Then the spinned infinite word (bca){dcb) 2 {dcc) u ' directs 
a Lyndon episturmian word, and so does aa(dc) UJ , but cabadcd^ does not (since this spinned word 
directs a periodic word). 

Remark 7.14. Theorems 4.13 and 7.12 show that any episturmian Lyndon word has a unique
spinned directive word, but the converse is not true. For example, the regular wavy word $(a\bar{b}c)^\omega$
is the unique directive word of the strict episturmian word:

$$\lim_{n \to \infty} \mu^n_{a\bar{b}c}(a) = acabaabacabacabaabaca \cdots$$

which is clearly not an infinite Lyndon word by Theorem 7.12 and also by the fact that $acabaaw$
is not a Lyndon word for any order on $\{a, b, c\}$ and for any word $w$.

A key tool used in the proof of Theorem 7.12 was the following result of Richomme, which
characterizes episturmian morphisms that preserve Lyndon words. A morphism $f$ is said to
preserve finite (resp. infinite) Lyndon words if for each finite (resp. infinite) Lyndon word $w$, $f(w)$
is a finite (resp. infinite) Lyndon word.
Theorem 7.15. [100, 103] Let $\mathcal{A} = \{a_1, \ldots, a_m\}$ be an alphabet ordered by $a_1 < a_2 < \cdots < a_m$.
Then the following assertions are equivalent for an episturmian morphism $f$:

• $f$ preserves finite Lyndon words;

• $f$ preserves infinite Lyndon words;

• $f \in (\bar{\Psi}^*_{\{a_2, \ldots, a_m\}} \psi_{a_1})^* \{\bar{\psi}_{a_m}\}^*$ where $\bar{\Psi}_{\mathcal{A}} = \{\bar{\psi}_x \mid x \in \mathcal{A}\}$. □

7.4 Imbalance

We now return our attention to the notion of balance. 

Episturmian words are generally unbalanced in the sense of 1-balance,
except, of course, for those on a 2-letter alphabet, which are precisely the (periodic and aperiodic)
Sturmian words. In fact, Cassaigne, Ferenczi, and Zamboni [33] have proved, by construction,
that there exists an episturmian word that is not $q$-balanced for any $q$. Note, however, that the
Tribonacci word is 2-balanced, for example. More generally, it can be shown by induction that
the $k$-bonacci word, directed by $(a_1a_2 \cdots a_k)^\omega$, is $(k - 1)$-balanced. Even further, one can prove
that any linearly recurrent strict episturmian word (or Arnoux-Rauzy sequence) is $q$-balanced for
some $q$. Linearly recurrent Arnoux-Rauzy sequences were completely described in [105, 32]: they
are the strict episturmian words for which each letter $x$ occurs in $\Delta$ with bounded gaps.

Using their main result on return words (Theorem 6.13), Justin and Vuillon [76] proved that
episturmian words do in fact satisfy a kind of balance property. Specifically:

Theorem 7.16. [76, Theorem 5.2] Let $s \in \mathcal{A}^\omega$ be an epistandard word and let $\{d, e\}$ be a 2-letter
subset of $\mathcal{A}$. Then, for any $u, v \in F(s) \cap \{d, e\}^*$ with $|u| = |v|$, we have $\big| |u|_d - |v|_d \big| \leq 1$. □

This property of episturmian words reduces to the balance property of Sturmian words when 
A is a 2-letter alphabet (in which case it is characteristic); however, the property is far from being 
characteristic when A consists of more than two letters. 

More recently, Richomme [101] also proved that episturmian words and Arnoux-Rauzy
sequences can be characterized via a nice 'local balance property'. That is:

Theorem 7.17. [101] For a recurrent infinite word $t \in \mathcal{A}^\omega$, the following assertions are
equivalent:

i) $t$ is episturmian;

ii) for each factor $u$ of $t$, there exists a letter $a$ such that $\mathcal{A}u\mathcal{A} \cap F(t) \subseteq au\mathcal{A} \cup \mathcal{A}ua$;

iii) for each palindromic factor $u$ of $t$, there exists a letter $a$ such that $\mathcal{A}u\mathcal{A} \cap F(t) \subseteq au\mathcal{A} \cup
\mathcal{A}ua$. □

Roughly speaking, the above theorem says that for any factor $u$ of a given episturmian word
$t$, there exists a unique letter $a$ such that every occurrence of $u$ in $t$ is immediately preceded or
followed by $a$ in $t$. When $|\mathcal{A}| = 2$, property ii) of Theorem 7.17 is equivalent to the definition
of balance. Indeed, Coven and Hedlund [37] stated that an infinite word $w$ over $\{a, b\}$ is not
balanced if and only if there exists a palindrome $u$ such that $aua$ and $bub$ are both factors of $w$.
As pointed out in [101], this property can be rephrased as follows: an infinite word $w$ is Sturmian
if and only if $w$ is aperiodic and, for any factor $u$ of $w$, the set of factors belonging to $\mathcal{A}u\mathcal{A}$ is a
subset of $au\mathcal{A} \cup \mathcal{A}ua$ or a subset of $bu\mathcal{A} \cup \mathcal{A}ub$.
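Property ii) of Theorem 7.17 can be explored on finite prefixes. The sketch below collects, for each short factor $u$ of a finite word, the two-sided extensions $xuy$ that occur, and asks whether a single letter appears at the start or the end of every one of them. The function names, the prefix length, and the bound on $|u|$ are illustrative assumptions; a finite prefix only shows part of $F(t)$, so a positive answer is a consistency check rather than a proof.

```python
def morphic_prefix(rules: dict, n: int, seed: str = "a") -> str:
    """Prefix of length n of the fixed point of a morphism whose image of `seed` starts with `seed`."""
    w = seed
    while len(w) < n:
        w = "".join(rules[c] for c in w)
    return w[:n]


def extensions(t: str, u: str) -> set:
    """Factors of the form x u y (x, y letters) occurring in the finite word t."""
    n = len(u)
    return {t[i - 1:i + n + 1] for i in range(1, len(t) - n) if t.startswith(u, i)}


def has_local_balance(t: str, max_len: int) -> bool:
    """Check property ii) for all factors u of t with |u| <= max_len (empty word included)."""
    alphabet = set(t)
    factors = {t[i:i + n] for n in range(max_len + 1) for i in range(len(t) - n + 1)}
    for u in factors:
        ext = extensions(t, u)
        if not any(all(e[0] == a or e[-1] == a for e in ext) for a in alphabet):
            return False
    return True


TRIBONACCI = {"a": "ab", "b": "ac", "c": "a"}
print(has_local_balance(morphic_prefix(TRIBONACCI, 400), 5))  # True
print(has_local_balance("abc" * 20, 5))                       # False
```

The periodic word $(abc)^\omega$ already fails for the empty factor: its factors of length 2 are `ab`, `bc`, and `ca`, and no letter occurs in all three.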
7.5 Fraenkel's conjecture

As discussed previously, the recurrent balanced infinite words on two letters are exactly the 
Sturmian words (aperiodic and periodic). A natural question to ask is then: "What are the 
balanced recurrent infinite words on more than two letters?" In this direction, Paquin and Vuillon 
[92] recently characterized the balanced episturmian words by classifying these words into three 
families, as follows. 

Theorem 7.18. [92] Any balanced standard episturmian sequence $s$ on a $k$-letter alphabet $\mathcal{A}_k =
\{1, 2, \ldots, k\}$, $k \geq 3$, belongs to one of the following three families (up to letter permutation):

i) $s = p(k - 1)p(kp(k - 1)p)^\omega$, with $p = Pal(1^n 2 \cdots (k - 2))$;

ii) $s = p(k - 1)p(kp(k - 1)p)^\omega$, with $p = Pal(123 \cdots (k - \ell - 1)1(k - \ell) \cdots (k - 2))$;

iii) $s = [Pal(123 \cdots k)]^\omega$. □

The importance of the above result lies in the fact that it supports Fraenkel's conjecture [56]:
a problem that arose in a number-theoretic context and has remained unsolved for over thirty
years. Fraenkel conjectured that, for a fixed $k \geq 3$, there is only one covering of $\mathbb{Z}$ by $k$ Beatty
sequences of the form $(\lfloor \alpha n + \beta \rfloor)_{n \geq 1}$, where $\alpha$, $\beta$ are real numbers. A combinatorial interpretation
of this conjecture may be stated as follows (taken from [92]). Over a $k$-letter alphabet with $k \geq 3$,
there is only one recurrent balanced infinite word, up to letter permutation and shifts, that has
mutually distinct letter frequencies. This supposedly unique infinite word is called Fraenkel's
sequence and is given by $(F_k)^\omega$ where the Fraenkel words $(F_i)_{i \geq 1}$ are defined recursively by $F_1 = 1$
and $F_i = F_{i-1}iF_{i-1}$ for all $i \geq 2$. (Note that $F_k = Pal(12 \cdots k)$.) For further details, see for
instance [92, 110] and references therein.
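The recursive definition of the Fraenkel words is short enough to run. The sketch below builds $F_k$ for $k \leq 9$ (so that each letter is a single digit), compares it with $Pal(12 \cdots k)$, and checks balance of the periodic word $(F_k)^\omega$ by examining windows shorter than one period on a doubled copy. All function names are illustrative assumptions.

```python
from collections import Counter


def fraenkel_word(k: int) -> str:
    """F_1 = 1 and F_i = F_{i-1} i F_{i-1}; letters are the digits 1..k (k <= 9)."""
    w = "1"
    for i in range(2, k + 1):
        w = w + str(i) + w
    return w


def pal(w: str) -> str:
    """Iterated palindromic closure of w."""
    u = ""
    for x in w:
        u += x
        # Append the mirror image of what precedes the longest palindromic suffix.
        i = next(i for i in range(len(u) + 1) if u[i:] == u[i:][::-1])
        u = u + u[:i][::-1]
    return u


def is_balanced_periodic(period_word: str) -> bool:
    """Balance of the infinite periodic word (period_word)^omega.

    A window of length m*p + r contains m * |period_word|_x plus the count of a
    window of length r, so it suffices to test lengths r < p on a doubled copy.
    """
    p, w = len(period_word), period_word * 2
    for x in set(period_word):
        for n in range(1, p):
            counts = {w[i:i + n].count(x) for i in range(p)}
            if max(counts) - min(counts) > 1:
                return False
    return True


F3 = fraenkel_word(3)
print(F3, pal("123"))            # 1213121 1213121
print(is_balanced_periodic(F3))  # True
print(Counter(F3))               # letter counts 4, 2, 1: mutually distinct frequencies
```

The letter counts in $F_k$ are the powers of two $2^{k-1}, \ldots, 2, 1$, which is why the letter frequencies of $(F_k)^\omega$ are mutually distinct.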

Amongst the classes of balanced episturmian words given in Theorem 7.18, only one class
has mutually distinct letter frequencies and, up to letter permutation and shifts, corresponds to
Fraenkel's sequence. That is:

Theorem 7.19 (Paquin-Vuillon [92]). Suppose $t$ is a balanced episturmian word with $Alph(t) =
\{1, 2, \ldots, k\}$, $k \geq 3$. If $t$ has mutually distinct letter frequencies, then up to letter permutation, $t$
is a shift of $(F_k)^\omega$. □

More recently, it was proved in [61] that any recurrent balanced rich infinite word is necessarily
episturmian, and hence such words obey Fraenkel's conjecture (recall that rich words were defined
in Section 6.2.2).

Remark 7.20. An interesting known fact (e.g., see [68]) is that any balanced recurrent infinite
word $x$ on $k \geq 3$ letters having mutually distinct letter frequencies is necessarily periodic.
Certainly, the image of $x$ under any morphism of the form ($a \mapsto a$, other $x \mapsto b$) is a Sturmian word.
If, for one letter, the corresponding Sturmian word is aperiodic (i.e., has irrational slope as a
cutting sequence), then we meet impossibility; thus rather easily $x$ must be periodic.

8 Concluding remarks 

In closing, we mention a number of very recent works involving episturmian words. 
Rigidity: Krieger [78] has shown that any strict purely morphic epistandard word s is rigid.
That is, all of the morphisms that generate s are powers of the same unique (epistandard) 
morphism. Krieger also showed that a certain class of 'ultimately strict' purely morphic 
epistandard words are not rigid, but it remains an open question as to whether or not all 
strict morphic episturmian words are rigid. 

Quasiperiodicity: A finite or infinite word $w$ is said to be quasiperiodic if there exists a word
$u$ (with $u \neq w$ for finite $w$) such that the occurrences of $u$ in $w$ entirely cover $w$, i.e., every
position of w falls within some occurrence of u in w. Such a word u is called a quasiperiod 
of w. For example, the word w = abaababaabaababaaba has quasiperiods aba, abaaba, 
abaababaaba. 

In the last fifteen years, quasiperiodicity and coverings of finite words have been extensively
studied (see [9] for a brief survey on quasiperiodicity in 'strings'). Quasiperiodic finite
words were first introduced by Apostolico and Ehrenfeucht in [10]. The notion was later
extended to infinite words by Marcus [85], who posed some open questions, particularly concerning
quasiperiodicity of Sturmian words. After a brief answer to some of these questions in [79],
the Sturmian case was fully studied by Levé and Richomme [81], who proved that a Sturmian
word is non-quasiperiodic if and only if it is an infinite Lyndon word. The study of
quasiperiodicity in Sturmian words was very recently extended to episturmian words by Glen, Levé,
and Richomme [58, 64, 80], who have completely described all of the quasiperiods of an
episturmian word, yielding a characterization of quasiperiodic episturmian words in terms 
of their directive words. They have also characterized episturmian morphisms that map any 
word onto a quasiperiodic one. These results show that, unlike the Sturmian case, there 
exist non-quasiperiodic episturmian words that are not infinite Lyndon words. Key tools 
used in the study of quasiperiodicity in episturmian words were episturmian morphisms, 
normalized directive words (recall Theorem 4.12), and the following equivalent definition of
quasiperiodicity in terms of return words introduced by Glen in [58]: a finite word v is a
quasiperiod of an infinite word w if and only if v is a recurrent prefix of w such that all of
the returns to v in w have length at most |v|.
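
The return-word criterion can be explored numerically on a long finite prefix. The following sketch is only a heuristic illustration: recurrence cannot be certified from a finite prefix, so it merely checks that v is a prefix and that the gaps between consecutive occurrences of v (the lengths of the returns to v seen in the prefix) never exceed |v|. It uses the Fibonacci word, generated by iterating the morphism a ↦ ab, b ↦ a, as a test case; all function names are hypothetical.

```python
def fibonacci_prefix(n):
    """Prefix of length n of the Fibonacci word (fixed point of a -> ab, b -> a)."""
    w = "a"
    while len(w) < n:
        w = "".join("ab" if c == "a" else "a" for c in w)
    return w[:n]


def return_lengths(v, w):
    """Gaps between consecutive occurrences of v in the finite word w."""
    starts = [i for i in range(len(w) - len(v) + 1) if w.startswith(v, i)]
    return [j - i for i, j in zip(starts, starts[1:])]


def passes_return_test(v, w):
    """Finite-prefix version of the criterion: v is a prefix of w, occurs at
    least twice, and every observed return to v has length at most |v|."""
    gaps = return_lengths(v, w)
    return w.startswith(v) and bool(gaps) and max(gaps) <= len(v)


w = fibonacci_prefix(200)
print(sorted(set(return_lengths("aba", w))))  # [2, 3]
print(passes_return_test("aba", w))           # True
print(passes_return_test("ab", w))            # False: a return of length 3 > 2
```

On this prefix the returns to aba have lengths 2 and 3, both at most |aba| = 3, whereas ab has a return of length 3 and so fails the test.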

In [89], Monteil proved that any Sturmian subshift contains a multi-scale quasiperiodic word, 
i.e., an infinite word having infinitely many quasiperiods. A shorter proof of this fact was 
provided in [81] and this result has also been proven true for episturmian words in [64].

For more recent work on quasiperiodicity, see for instance [89, 90].

θ-episturmian words: Recall that an infinite word is episturmian if and only if its set of factors
is closed under reversal and it has at most one left special factor of each length. With this
definition in mind, Bucci, de Luca, De Luca, and Zamboni [27, 28] have recently introduced
and studied a further extension of episturmian words in which the reversal operator is
replaced by an arbitrary involutory antimorphism (i.e., a map θ : A* → A* such that θ² =
Id and θ(uv) = θ(v)θ(u) for all u, v ∈ A*). More precisely, an infinite word over A is said to
be θ-episturmian if it has at most one left special factor of each length and its set of factors
is closed under an involutory antimorphism θ of the free monoid A*. Generalizing even
further, θ-episturmian words with seed are obtained by requiring the condition on special
factors only for sufficiently large lengths (see [28]).
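
To make the notion of an involutory antimorphism concrete, here is a minimal Python sketch. It relies on the observation that any such map is reversal composed with an involutory permutation of the letters, so it is determined by an "exchange" table on the alphabet. The helper names and the sample alphabet {a, b, c} are assumptions made for illustration only.

```python
def make_involutory_antimorphism(exchange):
    """Build theta from a letter involution given as a dict.

    `exchange` must satisfy exchange[exchange[a]] == a for every letter a;
    theta then reverses a word and applies `exchange` letter by letter.
    """
    for a, b in exchange.items():
        if exchange.get(b) != a:
            raise ValueError(f"not an involution on letters: {a!r} -> {b!r}")

    def theta(word):
        return "".join(exchange[c] for c in reversed(word))

    return theta


def is_closed_under(theta, words):
    """True if the finite set `words` is mapped into itself by theta."""
    words = set(words)
    return all(theta(w) in words for w in words)


# Example: exchange a and b, fix c.
theta = make_involutory_antimorphism({"a": "b", "b": "a", "c": "c"})
u, v = "aab", "ca"
print(theta(u))                                # abb
print(theta(theta(u)) == u)                    # True: theta squared is the identity
print(theta(u + v) == theta(v) + theta(u))     # True: antimorphism property
```

Taking the identity as the exchange table recovers the reversal operator, and hence the ordinary episturmian setting.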